Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I wonder if someone has worked on automatic translation of Excel spreadsheets with formulas to a proper programming language. Using the Python openpyxl package, you can read an Excel spreadsheet and get the contents of each cell, either a value or a formula. Having stored this data, it should be possible to

(1) look for changes in formulas, which could indicate an error. Why is the formula for cell A99 different from that in A100?

(2) write formulas in Python/NumPy or another language that replicates the formulas.



It is a proper programming language.

For #1, there has been research on linting and automated semantic error or smell detection in Excel (e.g., https://github.com/ExceLint/ExceLint). Some features are live in Excel today!

For #2, Excel now supports Python.

I don't think you need to do any conversions, all of the necessary information exists in the spreadsheet.


> It is a proper programming language.

How do you define a reusable function?

---

(I know the answer....it's a proper programming language.)



Lambda support in Excel is quite new though, isn’t it? If the argument is “Excel is a proper programming language because it now supports lambdas”, that implies that for most of its existence it hasn’t been one


We've had VLOOKUP for a while.


That's a dictionary, not a general function.


it works well enough in discrete domains... it's literally the definition of a function - map something to something else.


Yes, a function is an injection [1].

Abstractly.

Listing all possible inputs with their outputs, and then doing a lookup....that's only sometimes useful.

[1] https://en.wikipedia.org/wiki/Injective_function


If we are talking Excel 365 after Feb 2022, then yes.

If we're talking Excel 2021, Excel 2019, etc. then no.


So Excel supports reusable Lambdas then? In the same way that many tools introduce new features and users need to upgrade versions to get them.


Yep, Excel 365 Version 2201+ support reusable lambdas.



VBA


Definitely has me wondering: there are lots of regex compilers out there, so where are the spreadsheet compilers? I should be able to convert a spreadsheet wholesale to Java or c# that clearly expresses the cell relationships, then go in and add names (if they aren't already there)


I know of some Python libraries that do this:

  https://github.com/bradbase/xlcalculator
  https://github.com/dgorissen/pycel
  https://github.com/vinci1it2000/formulas
You give it an .xlsx file and it compiles to a Python representation where you can mutate "cells" and observe calculated results in other "cells", according to the formula DAG. Not sure about other languages but I'd be surprised if Python were the only one to have libraries like this.

Funfact: it's more complicated than you think to implement these engines. There's no spec for the formulae, so you have to reverse-engineer based on observed behaviour. You run into a lot of strange edge cases in seemingly mundane arithmetic formulae, and type conversions are a constant source of surprise.


20 years ago, I was told there was such a beast for turning an Excel spreadsheet into a C# application, targeted at accelerating/productionizing financial models. I never used it, though.


I am not sure if there exists a convertor, but..

Emacs org-mode has tables, which you can use as spreadsheets [1]. The formulas are not written in the cells, but after the table, with one formula per line. You have to update the table every time you update the data or the formulas.

I once tried to use it, but it was way too difficult, and I went back to Libreoffice.

[1] https://orgmode.org/manual/The-Spreadsheet.html




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: