
I use it as a fancy monitor that I can strap to my face and that fits in a suitcase. Sometimes I want to work lying down, and it's great for that. Being based on iPadOS makes it kinda useless aside from the display.


How do you type when lying down?


Not OP, but I type while lying down in bed. I don't have the AVP, but I do have Xreal Air AR glasses that basically mirror my MacBook Pro's screen. I rest my Apple keyboard across my crotch/upper thighs, wherever my arms are most comfortable. To my right (on the bed) is my Apple Trackpad. It's kind of a hassle to move my hand from keyboard to trackpad, but I can drive almost all functions on my computer through the keyboard, so it's only used as a last resort. I also have AirPods in my ears because the laptop is closed and on the floor.

I do this when my sciatica pain is preventing me from sitting and I'm tired of standing at my desk. I can work lying down for up to a couple of hours and find the position to be highly comfortable and productive.


Not the OP, but you can do it with a split keyboard.


Could have been a typo.


> Some of this complexity may be necessary for achieving optimal performance in Jax. E.g. extra indirection to avoid the compiler making some bad fusion decision, or multiple calls so something can be marked as static for the jit in the outer call

certainly some of it is but not the lion's share - I have a much simpler (private) codebase which scales pretty similarly afaict.

the complexity of Maxtext feels more Serious Engineering ™ flavored, following Best Practices.
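For readers unfamiliar with the jit staticness the quoted comment mentions, here's a minimal, hypothetical sketch of marking an argument as static for `jax.jit` in an outer call (the function and argument names are made up, not from Maxtext):

```python
from functools import partial

import jax
import jax.numpy as jnp

# Hypothetical sketch: `num_layers` drives a Python-level loop, so it has to be
# a compile-time constant. Marking it static in the outer jitted call lets the
# tracer unroll the loop; a new value triggers a recompile rather than an error.
@partial(jax.jit, static_argnames="num_layers")
def forward(params, x, num_layers):
    for i in range(num_layers):      # unrolled at trace time
        x = jnp.tanh(x @ params[i])
    return x

params = [jnp.ones((8, 8)) for _ in range(3)]
y = forward(params, jnp.ones((2, 8)), num_layers=3)
```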


this is not even close to true


Yeah, I'm sorry for the incorrect statement.

MidJourney claims 3.3 hours of GPU time for $10. If that's 3.3 hours of A100, at a rate of $2/hour that's $6.60 of GPU cost.

My original statement was based on MJ feeling very expensive per image compared to the huge number of images I can generate in an hour with SDXL on my 3090 (and 3090s can be rented for $0.20 an hour).

But I forgot how overpriced A100s are (I doubt MJ is running 3090s but that'd be pretty cool), and that MJ is probably 4x the size of SDXL (although surely more optimized).

My revised statement is that for their $10 plan there's a few bucks of GPU compute cost. Probably between 2x and 5x margins.
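Spelling out the back-of-the-envelope math (the numbers are the ones assumed in the comment above, not MidJourney's actual costs):

```python
plan_price = 10.0     # USD for the MidJourney plan
gpu_hours = 3.3       # GPU-hours the plan is claimed to include
a100_rate = 2.0       # assumed USD per A100-hour at on-demand prices

upper_compute_cost = gpu_hours * a100_rate       # ~$6.60, the pessimistic bound
for margin in (2, 5):                            # the "between 2x and 5x" claim
    print(f"{margin}x margin -> ~${plan_price / margin:.2f} of GPU compute per plan")
print(f"cost at ${a100_rate:.0f}/h: ~${upper_compute_cost:.2f}")
```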


A100s aren't $2/h at MJ scale, tho. Firstly, they likely own quite a few, and those are relatively cheap now at $4k a card; secondly, you can get much better deals than $2/h if you rent a large amount and pre-pay.


Sure, I was establishing an upper and lower bound. That's why I said between 2x and 5x margins.


I literally just don't feel like running them tbh, and see no reason to publish them either way. Mostly prefer to let the outputs speak for themselves.

For a while I was using an FID variant for evaluation during training, but didn't find it very helpful vs just looking at output images.


Okay. That's probably the difference between a commercial and a research project.


lmao no. imagine trying to do text-to-image with anything other than deep learning. nothing else comes close.


We have artists for that. I heard they can compete with sota methods.


OP was about classical statistical techniques. I'm pretty sure human artists are not logistic regression?


Compete on quality, certainly not on price or speed.


yeah we trained v5 on TPUs and continue to train on them.


We didn't, v5 was trained on TPUs too


... no it definitely wasn't. that's $50m. read the paper, they tell you how long it took on a v4-256, which you know the public rental price for.
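As a rough, hedged sanity check of that figure (every number below is a placeholder assumption, not taken from the paper or Google's price list):

```python
# Placeholder assumptions only -- not the paper's figures or Google's actual rates.
chips = 256           # a TPU v4-256 slice, as discussed in the thread
assumed_rate = 3.0    # hypothetical USD per chip-hour (on-demand ballpark)
assumed_days = 30     # hypothetical wall-clock training time

run_cost = chips * assumed_rate * assumed_days * 24
print(f"~${run_cost:,.0f} for a single run")   # ~$550k -- nowhere near $50M
```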


Pretty much everybody who trains models trains tens to hundreds of models before their final run. But a better way to think about this is: Google spent money to develop the TPUv4, then paid money to have a bunch built, and they're sitting in data centers, either consuming electricity or not. Google has clearly decided the value of these results exceeds the amortized cost of running the TPU fleet (which is an enormous expense).


While I don't agree with gp's price estimate of $50M, you are also forgetting that to train the final model, they had to iterate on the model for research and development. This imposes a significant multiple over the cost to just train the final model.


they're talking about the total cost of the project, including the salaries of the humans, their health care, their office space, etc.


And where did you think the tagged dataset and software came from?


Marginal cost of using that is basically $0. The internal dataset is a sunk cost that's already been paid for (presumably for their other, revenue-generating products like Google Images). Half of their dataset is a publicly available one.


One order of magnitude over $500k is $5M; two orders is $50M.


That wouldn't make sense - compression has very cache-friendly access patterns, and would benefit greatly from the observed improvements in memory bandwidth.


That surprises me to hear. I would expect it to jump around in RAM a lot. And at higher compression settings, some compressors use a lot of RAM, more than will fit in cache.


SIMD - compression has gotten faster, but (assuming OP is correct rather than just missing info) the reference algorithm didn't have room to take advantage of SIMD. The relevant improvements since 2010 or so mostly look like bandwidth improvements, not latency, and coincide with the increasing ubiquity of SIMD instructions and SIMD-friendly algorithms.

