You can do it over ssh with authentication. That said: - X is burdened by this a...

jacquesm · on June 29, 2019

This would, for instance remove the possibility to run an interactive program on a CUDA server mounted in a rack in the server room while displaying the results on a workstation without the likes of 'VNC' to screw up your display and/or huge latency.

saltcured · on June 29, 2019

But, X doesn't really do that, assuming you mean OpenGL running on the GPU. If you did forward an X app on that server to your desktop and managed to get OpenGL to work, it would use GLX to forward OpenGL commands to your GPU on your workstation and completely ignore the server GPGPU hardware.

What I think you are describing is closer to what Wayland plus a remote display protocol might enable. Let the app use its beefy GPGPU hardware on the server and just blast the 2D frames to the workstation on each OpenGL buffer-swap, to be composited into the local framebuffer that drives the actual display. All the expensive texture-mapping, geometry calculation, and pixel shading would execute on the remote server.

jacquesm · on June 29, 2019

> But, X doesn't really do that, assuming you mean OpenGL running on the GPU.

No, I did not mean that.

saltcured · on June 29, 2019

I think this discussion is probably petering out, and unfortunately we're talking past each other somehow. This is probably my last time revisiting this article and its comments, but I thought I'd give one last effort to expose the assumptions that may be at the heart of our disagreement...

You have mentioned multiple times terrible performance and latency, but I feel this is a strawman argument. People are streaming HD and 4K video over the internet all the time now and watching live-streamed video games, where someone is playing a GPU-intensive application and having the output encoded to an efficient video compression stream in real-time. These compression algorithms are also being used in practice for video teleconferencing solutions all over the place, by many vendors. The latencies are low enough to have normal conversation, movement and speech. They work reasonably well over consumer-level, low performance WAN connections and can work flawlessly over faster WAN or LAN paths. I don't understand how this contemporary scene suggest remote display is impractical.

Also, please don't get caught up in the VNC-style remote desktop metaphor as the only way to achieve remote display. An X-over-SSH styled remote protocol could just as easily be designed to launch applications on a remote host, where it is able to allocate server-side GPU hardware resources and run a renderer off-screen which buffer-swaps directly into a video codec as the sink for new frames, pushing those compressed frames to the user-facing system where they are displayed as the content of one application window. There is no need to conflate remote-display with having an actual screen or desktop session active on the host running the application. This is much like SSH itself giving us any number of pseudo TTY shell sessions without requiring any real serial port or console activity on the remote server.

If you didn't mean that the "CUDA" hardware in the server is doing rendering (like in Google Stadia) then I am guessing you mean that the application is doing CUDA data processing as part of its application logic before generating separate display commands to a remote X server to actually visualize the results. I am having a really hard time understanding why anybody would want to split and distribute an application like this, because every such data-intensive visualization I am aware of has much tighter coupling between those application logic bits and the renderer than it does between the renderer's output buffers and the photon-emitting display screen. This is why there are extensions, e.g. in OpenCL and OpenGL, to share GPGPU buffers between compute and rendering worlds. The connection between the application data-processing and the rendering can be kept local to the GPGPU hardware and never even traverse the PCIe bus in the host. Massive data-reduction usually happens during rendering (projection, cropping, occlusion, and resampling) and the output imagery is much smaller than the renderer state updates and draw-commands which contribute to a frame.

For what it's worth, I do know of cases where remote display is impractical due to latency, but these are also cases where remote rendering are impractical. For example, an immersive head-mounted display may require a local renderer to have extremely low latency for head-tracking sensors to feed back into the renderer as camera geometry so rendered imagery matches head movement. A remote rendering protocol would be just as impractical here, because the rendering command stream that embodies this updated camera and frame refresh would also have too much latency to accomplish the task. The only practical way to distribute such an application would be to introduce an application-specific split between the local immersive rendering application, placed near the user, and other less latency-sensitive simulation and world-modeling which can be offloaded to the remote server. A rendering protocol like X doesn't provide the right semantics for the asynchronous communications that have to happen between these two stages. Instead, you need many of the features reflected in a distributed, multiplayer game: synchronizing protocols to exchange agent/object behavior updates, local predictive methods to update the local renderer's model in the absence of timely global updates, and a data-library interface to discover and load portions of the remote simulation/model into the renderer on demand based on user activity and resource constraints in the local rendering application.