I think working on that kind of system would be actively harmful for most programmers. It would give them a totally unbalanced intuition for the appropriate tradeoff between memory consumption and other attributes (maintainability, defect rate, ...). If anything, programmers should learn on the kind of machine that will be typical for most of their career, which probably means starting with a giant supercomputing cluster to match what's going to be in everyone's pocket in 20 years' time.