Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's amazing how we are watching use cases for notebooks and spreadsheets converging. I wonder what the killer feature will be to bring a bigger chunk of the Excel world into a programmatic mindset... Or alternatively, whether we will see notebook UIs embedded in Excel in the future in place of e.g. VBA.


From my experience, I think that this is unlikely. A notebook abstracts a REPL and allows for literate programming. It's mostly used by people who used to write computer programs to carry out analyses (e.g., applied statisticians or numerical mathematicians). A notebook gives them the opportunity to explain the code block, include equations, etc.

A spreadsheet allows for semi-automation of data processing. Each cell can have a (rather simple) function defined and its evaluation result is printed in that cell. You can actually build up pretty complex workflows by just concatenating cell evaluations.

To give a more concrete example, think about a loop. It is arguably the building block of any computer programming language and a necessary cornerstone in learning how to program. Both notebooks and spreadsheets implement loops differently. You can code a loop in a notebook but the cell output will be difficult to interpret (think of it having to fit a linear model for 5 different outcomes). You would be better off just splitting up the cell and run the models separately. That will allow for commenting the code and explaining the results, just like you would in writing a paper. In a spreadsheet, you would define a function, then copy/paste it for the cells you want it evaluated for. No programming required, just knowledge of how to link to cells from within a function and how to copy/paste in the spreadsheet. That's why spreadsheets are wildly used by non-technical people with little knowledge of computer programming.


I've used a lot of both. Excel is much better for most tasks where you have < 1mil rows of data. Its easier to look at the data, easier for novices and fast enough. Just being able to scroll through the data is very valuable just to get a feel for it. The biggest drawback is VBA, if you could write excel macros in Python it would be a hit.

If you have more data, Notebooks can handle that better. However I've noticed lots of colleagues skipping notebooks and using IDEs instead. Much easier to work with and better for scm. I'm not a huge fan of notebooks any more.


> most tasks where you have < 1mil rows of data

Now that's quite a generalization... most tasks? If all you're doing is a=b+c then perhaps. I work in HFT and even for trivial data exploration I would never even consider touching Excel; why would I? Even if it's just 100 rows. No thanks. Once you're comfortable with Python / its scientific stack, the exploratory part of data analysis becomes fast and trivial.

What I would like to see is notebooks becoming more IDE-like. This is already happening gradually, e.g. with JupyterLab replacing Jupyter notebooks ([1])

[1] https://jupyterlab.readthedocs.io/en/stable/


As seen in the post, Finance has caught on to this idea. The Bloomberg Terminal now provides BQuant. It's an almost fully functional IPython notebook with built in access to their financial datasets.

Analysts that used to work in excel are moving their models into environments like these. Libraries for most common functionality are provided, and allow someone with only a bit of VBA knowledge to feel comfortable enough to start working with python.

And when you browse places like r/financialcareers, it's filled with finance students wondering which programming languages they should learn. And the answer is always to learn python using jupyter notebooks.


That’s not a bad idea. Spreadsheets are pure functional languages built that use literal spaces instead of namespaces.

Notebooks are cells of logic. You could conceivably change the idea of notebook cells to be an instance of a function that points to raw data and returns raw data.

Perhaps this just Alteryx though


This is brilliant.

I'm picturing the ability to write a Python function with the parameters being just like the parameters in an Excel function. You can drag the cell and have it duplicated throughout a row, updating the parameters to correspond to the rows next to it.

It would exponentially expand the power of excel. I wouldn't be limited to horribly unmaintainable little Excel functions.

VBA can't be used to do that, can it? As far as I understand (and I haven't investigated VBA too much) VBA works on entire spreadsheets.

Essentially, replace the excel formula `=B3-B4` with a Python function `subtract(b3, b4)` where Subtract is defined somewhere more conveniently (in a worksheet wide function definition list?).


You can build user defined functions in Excel with VBA as well as with Python through something like xlwings. One of the issues that I ran into with xlwings (or any third party integration into the Office suite) is portability between users.

The ubiquity of Excel is both a blessing and a curse in that everyone has it, so everyone uses it, regardless of whether or not it is the best tool for the job.


Google Colaboratory is now Ubiquitous in the sense you use the term, as is Microsoft Azure Notebooks, so the Ubiquity argument is no longer unique to Excel. The big argument in favor of notebooks is transparency and the breadth of tools that they can make use of. Economists will increasingly move away from Excel as the QuantEcon website demonstrates. Perhaps accountants will still uses spreadsheets, after all they invented them, but it's unclear why anyone else really needs them when there are better tools available.


This would require a reactive recomputing of cells to be anything like a spreadsheet. > Essentially, replace the excel formula `=B3-B4` with a Python function `subtract(b3, b4)`

as of now jupyter/ipython would not recompute `subtract(b3, b4)` if you change b3 or b4, this has positive and negative (reliance on hidden state and order of execution) effects.

I too would really like something like this, but I think it is pretty far away from where jupiter is now.


You can build something like this with Jupyter today.

> Traitlets is a framework that lets Python classes have attributes with type checking, dynamically calculated default values, and ‘on change’ callbacks. https://traitlets.readthedocs.io/en/stable/

> Traitlet events. Widget properties are IPython traitlets and traitlets are eventful. To handle changes, the observe method of the widget can be used to register a callback https://ipywidgets.readthedocs.io/en/stable/examples/Widget%...

You can definitely build interactive notebooks with Jupyter Notebook and JupyterLab (and ipywidgets or Altair or HoloViews and Bokeh or Plotly for interactive data visualization).

> Qgrid is a Jupyter notebook widget which uses SlickGrid to render pandas DataFrames within a Jupyter notebook. This allows you to explore your DataFrames with intuitive scrolling, sorting, and filtering controls, as well as edit your DataFrames by double clicking cells. https://github.com/quantopian/qgrid

Qgrid's API includes event handler registration: https://qgrid.readthedocs.io/en/latest/

> neuron is a robust application that seamlessly combines the power of Visual Studio Code with the interactivity of Jupyter Notebook. https://marketplace.visualstudio.com/items?itemName=neuron.n...

"Excel team considering Python as scripting language: asking for feedback" (2017) https://news.ycombinator.com/item?id=15927132

OpenOffice Calc ships with Python 2.7 support: https://wiki.openoffice.org/wiki/Python

Procedural scripts written in a general purpose language with named variables (with no UI input except for chart design and persisted parameter changes) are reproducible.

What's a good way to review all of the formulas and VBA and/or Python and data ETL in a spreadsheet?

Is there a way to record a reproducible data transformation script from a sequence of GUI interactions in e.g. OpenRefine or similar?


OpenRefine/OpenRefine/wiki/Jupyter

"Within the Python context, a Python OpenRefine client allows a user to script interactions within a Jupyter notebook against an OpenRefine application instance, essentially as a headless service (although workflows are possible where both notebook-scripted and live interactions take place. https://github.com/OpenRefine/OpenRefine/wiki/Jupyter

Are there data wrangling workflows that are supported by OpenRefine but not Pandas, Dask, or Vaex?


This interesting need to have a closer look, possibly refine can be more efficient? But haven't used it enough to know, just payed around with it a bit. Didn't realise you could combine it with jupyter.


Notebooks are very much based on linear order and side-effects, though (while Excel is, as you say, a purely functional DAG).

Not sure how they can converge.


Two types of cells. One type defines functions. Another are instances of those functions that require inputs. Arrange them in click n drag floaty space. Cells of type one exist in a name space. Cells of type two exist in spacey space.

This kind of thing exists at a larger scale for pipeline visualization. I could see it working for notebooks.


> Notebooks are very much based on linear order and side-effects

Try https://beta.observablehq.com/


The ease of the calculation tree in Excel versus having to keep track of what cells in a notebook you have updated was a large part of why we built and open-sourced Loman [1]. It's a computation graph that keeps track of state as you update data or computation functions for nodes. It also ends up being useful for real-time interfaces, where you can just drop what you need at the top of a computation graph and recalculate what needs updating, and also for batch processes where you can serialize the entire graph for easy debugging of failures (there are always eventually failures). We also put together some examples relevant to finance [2]

[1] https://loman.readthedocs.io/en/latest/user/quickstart.html

[2] https://github.com/janushendersonassetallocation/loman/tree/...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: