Hacker News | jamesblonde's comments

If they are exploding categorical variables using one-hot encoding (OHE) and storing the resulting columns, that is the wrong thing to do. You should only ever store untransformed feature data in tables. You apply feature transformations like OHE when reading from the tables, because those transformations are parameterized by the data you read (the training data subset you select).

Reference: https://www.hopsworks.ai/post/a-taxonomy-for-data-transforma...
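To make the point concrete, here is a minimal sketch in plain Python. The table rows, column names, and helper functions are all hypothetical and not taken from any feature store API. It only illustrates that the encoding is learned from whichever subset you read, so a different training selection yields different columns.

```python
from typing import Dict, List

def fit_one_hot(values: List[str]) -> Dict[str, int]:
    """Learn the category -> column index mapping from the selected data only."""
    return {cat: i for i, cat in enumerate(sorted(set(values)))}

def transform_one_hot(values: List[str], mapping: Dict[str, int]) -> List[List[int]]:
    """Encode values with a fitted mapping; unseen categories become all zeros."""
    rows = []
    for v in values:
        row = [0] * len(mapping)
        idx = mapping.get(v)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows

# Untransformed feature data as stored in the table (hypothetical rows).
stored_rows = [
    {"id": 1, "country": "SE", "year": 2022},
    {"id": 2, "country": "DE", "year": 2022},
    {"id": 3, "country": "SE", "year": 2023},
    {"id": 4, "country": "FR", "year": 2024},
]

# Training subset A: everything up to 2023. The encoding is fitted at read time.
train_a = [r["country"] for r in stored_rows if r["year"] <= 2023]
mapping_a = fit_one_hot(train_a)
encoded_a = transform_one_hot(train_a, mapping_a)

# Training subset B: all rows. Same stored data, different encoding.
train_b = [r["country"] for r in stored_rows]
mapping_b = fit_one_hot(train_b)
```

Had the one-hot columns been materialized in the table, they would have baked in one of these mappings and been wrong for the other selection.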


And the Wallenberg family.

Subsidiarity has been a key building block of the EU and has failed the EU for unexpected reasons. Subsidiarity was pursued for accountability and to make the EU less centralized - decisions should be made at the lowest, most local level possible, with central authorities only stepping in when a task cannot be effectively handled locally. However, it means that here in Sweden govt bodies are all individually moving to Azure, because each one makes that local decision in their best interest. The same thing has happened all over the EU - and very few govt bodies would ever take the risk of investing in using EU cloud or data platforms. We need public procurement to help kickstart life into the Eurostack.

It will be servers as well. Eurostack cloud providers. We are involved in one of these - a large car company doing the same.


They control Europe's digital infrastructure and are able to increase rent to usurious levels (tariffs!) because Europe is dependent on their digital services. Without digital sovereignty, Europe has no sovereignty and will quickly become a modern colony from which wealth will be extracted.


The reason the US is able to raise rents (tariffs) has nothing to do with Europe buying US digital services.

The tariffs are on European exports. The problem is Europe has a weak domestic consumer market and is dependent on selling stuff to the US, not buying from them.


The EU has a services deficit compared to the US, the US has a goods deficit compared to Europe. Together, they are almost in balance, the difference is just 3% of total trade [1]. Put differently, the US and the EU need each other. This is why Trump is using footguns.

[1] https://policy.trade.ec.europa.eu/eu-trade-relationships-cou...


The problem is really that Europe has a few dozen weak consumer markets. If there really was a proper single market, I suspect the EU would be much more competitive in digital services.

Unfortunately despite their best efforts this isn't something Eurocrats can simply will into existence. The most important prerequisite is a common language, and there is zero political will to do the only sensible thing and establish English as the official common language of the EU.


Nonsense. Unilateral tariffs are not how trade agreements work. This is pure extractive rent.

The reason the US is not able to extract the same rents from China is that they have digital sovereignty and the US cannot just pull the cloud plug from them.


> Nonsense. Unilateral tariffs are not how trade agreements work. This is pure extractive rent.

What do you mean by "unilateral tariffs"?

> The reason the US is not able to extract the same rents from China is that they have digital sovereignty and the US cannot just pull the cloud plug from them.

The US has higher tariffs against Chinese imports than European imports.


I agree that this is an anti-pattern for training. In training, you are often I/O bound over S3, and high-bandwidth networking alone doesn't fix it (.safetensors files are typically 4GB in size). You need NVMe and high-bandwidth networking along with a distributed file system.

We do this with tiered storage over S3 using HopsFS, which has an HDFS API with a FUSE client, so training can just read data as if it were local, although it is actually pulled from the HopsFS datanodes' NVMe cache over the network. In contrast, writes go straight to S3 via the HopsFS write-through NVMe cache.
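As a rough illustration of the read and write paths described above, here is a toy sketch in Python. It is not how HopsFS is implemented: `SlowObjectStore` and `TieredStore` are hypothetical names, and in-memory dicts stand in for S3 and the NVMe cache.

```python
class SlowObjectStore:
    """Stand-in for S3: a key/value store that counts how often it is read."""

    def __init__(self):
        self._blobs = {}
        self.reads = 0

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self._blobs[key]


class TieredStore:
    """Fast cache tier (stand-in for NVMe) in front of the object store."""

    def __init__(self, backend: SlowObjectStore):
        self.backend = backend
        self.cache = {}

    def write(self, key: str, data: bytes) -> None:
        # Write-through: the object store is updated on every write,
        # and the cache keeps a copy for later reads.
        self.backend.put(key, data)
        self.cache[key] = data

    def read(self, key: str) -> bytes:
        # Read-through: only a cache miss touches the object store.
        if key not in self.cache:
            self.cache[key] = self.backend.get(key)
        return self.cache[key]


s3 = SlowObjectStore()
s3.put("model.safetensors", b"weights")  # object that already exists in S3
store = TieredStore(s3)

first = store.read("model.safetensors")   # miss: fetched from the object store
second = store.read("model.safetensors")  # hit: served from the cache tier

store.write("checkpoint-1", b"ckpt")      # lands in both tiers
ckpt = store.read("checkpoint-1")         # served from cache, no backend read
```

A real system also needs eviction, consistency between datanodes, and a network hop to the cache, none of which this sketch models.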


I have one (top of the line!). Here's how bad the engineers were. For the last 6 months, the device emits 10 audible beeps every 6 hours. I do a lot of customer meetings and public speaking. People would sometimes ask - "what is that noise"? I would say "No idea, but if you wait 8 seconds, it will stop"!

Also, my heart rate would sometimes drop below 40 bpm. Then it would start pacing, which I didn't want and was extremely uncomfortable.

p.s., the reason the battery ran out is that, by talking to experts globally, I found a treatment for my condition that works really well (I am a computer scientist). I wrote a case study paper about my condition to help others, co-authored by my doctors. https://www.slideshare.net/slideshow/arvc-and-flecainide-cas... 16 years later, the device is still in place, but I will have it removed early next year.


Anything sovereign AI or whatever is gone immediately when the mods wake up. Got an EU cloud article? Publish it at 11am CET, it disappears around 12.30.


See, Peter Thiel is smart. There are enough idiots who will buy his shtick - it's not just MAGA who get pointed in the direction he wants society to go (serfdom).


Cloudflare tried to build their own feature store, and got a grade F.

I wrote a book on feature stores, published by O'Reilly. The bad query they wrote in ClickHouse could have been caused by another error - duplicate rows in materialized feature data. Hopsworks, for example, prevents duplicate rows by building on primary key uniqueness enforcement in Apache Hudi. In contrast, Delta Lake and Iceberg do not enforce primary key constraints, and neither does ClickHouse. So they could have the same bug again due to a bug in feature ingestion - and given they hacked together their feature store, it is not beyond the bounds of possibility.

Reference: https://www.oreilly.com/library/view/building-machine-learni...
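To show why primary key enforcement matters here, a minimal sketch in plain Python follows. The rows, column names, and helper functions are hypothetical, and this is neither Hudi's mechanism nor a reconstruction of the actual incident. It only contrasts append-only ingestion with ingestion keyed on a primary key when the same batch is written twice.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

def append_rows(table: List[Row], rows: List[Row]) -> None:
    """Append-only ingestion: no uniqueness check, so retries duplicate rows."""
    table.extend(rows)

def upsert_rows(table: Dict[Tuple, Row], rows: List[Row], pk: List[str]) -> None:
    """Ingestion keyed on the primary key: a repeated row overwrites itself."""
    for row in rows:
        key = tuple(row[col] for col in pk)
        table[key] = row

# Hypothetical materialized feature rows.
batch = [
    {"account_id": 1, "day": "2024-01-01", "requests": 10},
    {"account_id": 2, "day": "2024-01-01", "requests": 5},
]

appended: List[Row] = []
keyed: Dict[Tuple, Row] = {}

# The ingestion job runs twice on the same batch, for example after a retry.
for _ in range(2):
    append_rows(appended, batch)
    upsert_rows(keyed, batch, pk=["account_id", "day"])

total_appended = sum(r["requests"] for r in appended)    # counts every row twice
total_keyed = sum(r["requests"] for r in keyed.values())  # counts each key once
```

Any downstream query that aggregates or joins on the appended table silently returns inflated or repeated results, which is the class of bug the comment is pointing at.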

