Project Genie: Experimenting with infinite, interactive worlds
Posted by meetpateltech 6 hours ago
Comments
Comment by jlhawn 1 hour ago
In this view, we are essentially living inside a high-fidelity generative model. Our brains are constantly 'hallucinating' a predicted reality based on past experience and current goals. The data from our senses isn't the source of the image; it's the error signal used to calibrate that internal model. Much like Genie 3 uses latent actions and frames to predict the next state of a world, our brains use 'Active Inference' to minimize the gap between what we expect and what we experience.
It suggests that our sense of 'reality' isn't a direct recording of the world, but a highly optimized, interactive simulation that is continuously 'regularized' by the photons hitting our retinas.
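In toy form, the prediction-error loop being described looks roughly like this (a minimal sketch; the variables and dynamics are illustrative assumptions, not anything from neuroscience or from Genie):

    import numpy as np

    rng = np.random.default_rng(0)
    true_state = 1.0       # the external "world"
    belief = 0.0           # the internal model's current estimate
    learning_rate = 0.1

    for step in range(100):
        prediction = belief                              # what the model expects to sense
        observation = true_state + rng.normal(0, 0.2)    # noisy sensory input
        error = observation - prediction                 # the senses supply only this signal
        belief += learning_rate * error                  # the simulation is "regularized" toward reality

    print(round(belief, 2))   # settles near 1.0

The observation never becomes the percept directly; it only nudges the running estimate.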
Comment by psychoslave 5 minutes ago
It’s also easy to find this idea treated in various philosophies and religions across time and place. And in any case, since consciousness is eager to project whatever looks like a possible fit, suggestions of prior art can be read back into the record as far as traces can be found.
Comment by tracerbulletx 22 minutes ago
Comment by kingstoned 6 minutes ago
Comment by shagie 1 hour ago
Comment by cfiggers 27 minutes ago
At what point does the processing become so strong that it's less a photograph and more a work of computational impressionism?
Comment by alastair 19 minutes ago
Comment by in-silico 4 hours ago
That is not the goal.
The purpose of world models like Genie is to be the "imagination" of next-generation AI and robotics systems: a way for them to simulate the outcomes of potential actions in order to inform decisions.
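As a rough sketch of what "simulating outcomes to inform decisions" means in practice (dynamics_model and reward_model are hypothetical stand-ins, not Genie's actual interface):

    import numpy as np

    def plan(state, dynamics_model, reward_model, horizon=10, n_candidates=64, seed=0):
        """Random-shooting planner: imagine each candidate action sequence
        inside the learned model and execute the first action of the best one."""
        rng = np.random.default_rng(seed)
        best_return, best_actions = -np.inf, None
        for _ in range(n_candidates):
            actions = rng.uniform(-1, 1, size=(horizon, 2))   # candidate action sequence
            s, total = state, 0.0
            for a in actions:
                s = dynamics_model(s, a)    # imagined next state, no real-world interaction
                total += reward_model(s)    # imagined value of ending up there
            if total > best_return:
                best_return, best_actions = total, actions
        return best_actions[0]

The system only ever executes the first action and then re-plans; all the "practice" happens inside the model.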
Comment by benlivengood 3 hours ago
Comment by avaer 4 hours ago
The whole reason for LLMs inferencing human-processable text, and "world models" inferencing human-interactive video, is precisely so that humans can connect in and debug the thing.
I think the purpose of Genie is to be a video game, but it's a video game for AI researchers developing AIs.
I do agree that the entertainment implications are kind of the research exhaust of the end goal.
Comment by NitpickLawyer 4 hours ago
Yeah, I think this is what the person above was saying as well. This is what people at google have said already (a few podcasts on gdm's channel, hosted by Hannah Fry). They have their "agents" play in genie-powered environments. So one system "creates" the environment for the task. Say "place the ball in the basket". Genie creates an env with a ball and a basket, and the other agent learns to wasd its way around, pick up the ball and wasd to the basket, and so on. Pretty powerful combo if you have enough compute to throw at it.
Comment by in-silico 4 hours ago
When you simulate a stream of those latents, you can decode them into video.
If you were trying to make an impressive demo for the public, you probably would decode them into video, even if the real applications don't require it.
Converting the latents to pixel space also makes them compatible with existing image/video models and multimodal LLMs, which (without specialized training) can't interpret the latents directly.
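Structurally, that split looks something like this (transition and decoder are hypothetical learned components, shown only to make the two stages concrete):

    def rollout_latents(z0, actions, transition):
        # Cheap stage: step the world forward entirely in latent space.
        zs = [z0]
        for a in actions:
            zs.append(transition(zs[-1], a))
        return zs

    def rollout_video(z0, actions, transition, decoder):
        # Optional, expensive stage: decode each latent into pixels, for
        # demos, debugging, or handing the clip to a video-capable LLM.
        return [decoder(z) for z in rollout_latents(z0, actions, transition)]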
Comment by SequoiaHope 4 hours ago
I think robots imagining the next step (in latent space) will be useful. It’s useful for people. A great way to validate that a robot is properly imagining the future is to make that latent space renderable in pixels.
[1] “By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment.”
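The quoted setup can be sketched as a policy that is almost trivially small because the world model has already done the perceptual work (shapes and the feature vector are illustrative assumptions):

    import numpy as np

    class LinearPolicy:
        """A compact controller acting on features extracted from a world model,
        e.g. the concatenated latent z and recurrent state h."""
        def __init__(self, feature_dim, action_dim, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(0.0, 0.1, size=(action_dim, feature_dim))
            self.b = np.zeros(action_dim)

        def act(self, world_model_features):
            return np.tanh(self.W @ world_model_features + self.b)

Because the policy is this small, it can be trained with simple black-box optimization entirely inside the model's "dream", as the quote describes.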
Comment by sailingparrot 2 hours ago
If you don't decode, how do you judge quality, in a world where generative metrics are famously hard and imprecise? And how do you integrate RLHF/RLAIF into your pipeline if you don't decode? That's not something you can skip anymore if you want SotA.
Just look at the companies that are explicitly aiming for robotics/simulation, they *are* doing video models.
Comment by abraxas 2 hours ago
Soft disagree. What is the purpose of that imagination if not to map it to actual real-world outcomes? And to compare those imagined outcomes to the real world, and possibly backpropagate through them, you'll need video frames.
Comment by ACCount37 4 hours ago
I do wonder if I can frankenstein together a passable VLA using pretrained LTX-2 as a base.
Comment by thegabriele 4 hours ago
Comment by koolala 4 hours ago
Comment by thegabriele 4 hours ago
Comment by empath75 3 hours ago
Even if you just wire this output (or probably multiples running different counterfactuals) into a multimodal LLM that interprets the video and uses it to make decisions, you have something new.
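A rough sketch of that wiring, with generate_rollout and vlm_score_clip as purely hypothetical helpers standing in for the world model and the multimodal LLM:

    def choose_action(current_obs, candidate_actions, goal_text,
                      generate_rollout, vlm_score_clip):
        # Imagine one counterfactual video per candidate action, let the
        # multimodal LLM judge each clip against the goal, act on the winner.
        scored = []
        for action in candidate_actions:
            clip = generate_rollout(current_obs, action)
            score = vlm_score_clip(clip, "How well does this clip achieve: " + goal_text)
            scored.append((score, action))
        best_score, best_action = max(scored, key=lambda pair: pair[0])
        return best_action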
Comment by oceanplexian 2 hours ago
As soon as this thing is hooked up to VR and reaches a tipping point with the general public we all know exactly what is going to happen. The creation of the most profitable, addictive and ultimately dystopian technology Big Tech has ever come up with.
Comment by ceejayoz 1 hour ago
Comment by pizzafeelsright 3 hours ago
I prefer real danger as living in the simulation is derivative.
Comment by whytaka 3 hours ago
Comment by slashdave 3 hours ago
You cannot invent data.
Comment by kingstnap 1 hour ago
This is a paper that recently got popular ish and discusses the counter to your viewpoint.
> Paradox 1: Information cannot be increased by deterministic processes. For both Shannon entropy and Kolmogorov complexity, deterministic transformations cannot meaningfully increase the information content of an object. And yet, we use pseudorandom number generators to produce randomness, synthetic data improves model capabilities, mathematicians can derive new knowledge by reasoning from axioms without external information, dynamical systems produce emergent phenomena, and self-play loops like AlphaZero learn sophisticated strategies from games
In theory yes, something like the rules of chess should be enough for these mythical perfect reasoners that show up in math riddles to deduce everything that *can* be known about the game. And similarly a math textbook is no more interesting than a book with the words true and false and a bunch of true => true statements in it.
But I don't think this is the case in practice. There is something about rolling things out and leveraging the results you see that seems to have useful information in it even if the roll out is fully characterizable.
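A tiny concrete instance of "rolling things out" yielding knowledge that isn't written in the rules themselves: exhaustive self-play (minimax) over nothing but the rules of tic-tac-toe recovers the fact that perfect play is a draw.

    LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] and board[a] == board[b] == board[c]:
                return board[a]
        return None

    def minimax(board, player):
        w = winner(board)
        if w == 'X': return 1
        if w == 'O': return -1
        if all(board): return 0                 # full board, no winner: a draw
        scores = []
        for i in range(9):
            if not board[i]:
                board[i] = player
                scores.append(minimax(board, 'O' if player == 'X' else 'X'))
                board[i] = None
        return max(scores) if player == 'X' else min(scores)

    print(minimax([None] * 9, 'X'))   # 0: perfect play from the empty board is a draw

Nothing outside the deterministic rules ever enters, yet the rollout leaves you knowing something you probably didn't before running it.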
Comment by slashdave 1 hour ago
What I object to are the "scaling maximalists" who believe that, given enough training data, complicated concepts like a world model will just spontaneously emerge during training. To then pile on synthetic data from a general-purpose generative model, as a solution to the lack of training data, becomes even more untenable.
Comment by whytaka 3 hours ago
If instead of a photo you have a video feed, this is one step closer to implementing subjective experience.
Comment by 2bitencryption 3 hours ago
Comment by slashdave 2 hours ago
Comment by echelon 4 hours ago
First of all, there are a variety of different types of world models: simulation, video, static asset, etc. It's a loaded term, and the use cases are just as widespread.
There are world models you can play in your browser inferred entirely by your CPU:
https://madebyoll.in/posts/game_emulation_via_dnn/ (my favorite, from 2022!)
https://madebyoll.in/posts/world_emulation_via_dnn/ (updated, in 3D)
There are static asset generating world models, like WorldLabs' Marble. These are useful for video games, previz, and filmmaking.
I wrote open source software to leverage marble for filmmaking (I'm a filmmaker, and this tech is extremely useful for scene consistency):
https://www.youtube.com/watch?v=wJCJYdGdpHg
https://github.com/storytold/artcraft
There are playable video-oriented models, many of which are open source and will run on your 3080 and above:
https://github.com/Robbyant/lingbot-world
There are things termed "world models" that really shouldn't be:
https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0
There are robotics training oriented world models:
https://github.com/leggedrobotics/robotic_world_model
Genie is not strictly robotics-oriented.
Comment by in-silico 3 hours ago
The other examples you've given are neat, but for players like Google they are mostly an afterthought.
Comment by echelon 3 hours ago
Gaming: $350B TAM
All media and entertainment: $3T TAM
Manufacturing: $5T TAM
Roughly the same story.
This tech is going to revolutionize "films" and gaming. The entire entertainment industry is going to transform around it.
When people aren't buying physical things, they're distracting themselves with media. Humans spend more time and money on that than anything else. Machines or otherwise.
AI impact on manufacturing will be huge. AI impact on media and entertainment will be huge. And these world models can be developed in a way that you develop exposure and competency for both domains.
edit: You can argue that manufacturing will boom when we have robotics that generalize. But you can also argue that entertainment will boom when we have holodecks people can step into.
Comment by in-silico 3 hours ago
Robots is also just one example. A hypothetically powerful AI agent (which might also use a world model) that controls a mouse and keyboard could replace a big chunk of white-collar work too.
Those are worth tens of trillions of dollars. You can argue about whether they are actually possible, but the people backing this tech think they are.
Comment by dingnuts 2 hours ago
Comment by dyauspitr 4 hours ago
Comment by avaer 4 hours ago
It's not really as much of a boon as you'd think though, since throwing together a 3D model is not the bottleneck to making a sellable video game. You've had model marketplaces for a long time now.
Comment by echelon 3 hours ago
It is for filmmaking! They're perfect for constructing consistent sets and blocking out how your actors and props are positioned. You can freely position the camera, control the depth of field, and then storyboard your entire scene I2V.
Example of doing this with Marble: https://www.youtube.com/watch?v=wJCJYdGdpHg
Comment by avaer 1 hour ago
Marble definitely changes the game if the game is "move the camera", just most people would not consider that a game (but hey there's probably a good game idea in there!)
Comment by cyanydeez 2 hours ago
So, like, it's very important to understand the lineage of training and not just the "this is it"
Comment by ollin 5 hours ago
- https://youtu.be/15KtGNgpVnE?si=rgQ0PSRniRGcvN31&t=197 walking through various cities
- https://x.com/fofrAI/status/2016936855607136506 helicopter / flight sim
- https://x.com/venturetwins/status/2016919922727850333 space station, https://x.com/venturetwins/status/2016920340602278368 Dunkin' Donuts
- https://youtu.be/lALGud1Ynhc?si=10ERYyMFHiwL8rQ7&t=207 simulating a laptop computer, moving the mouse
- https://x.com/emollick/status/2016919989865840906 otter airline pilot with a duck on its head walking through a Rothko inspired airport
Comment by msabalau 35 minutes ago
https://www.youtube.com/watch?v=FyTHcmWPuJE
It's an experimental research prototype, but it also feels like a hint of the future. Feel free to ask any questions.
Comment by RaftPeople 4 hours ago
Comment by post-it 1 hour ago
Comment by echelon 3 hours ago
Ironically, he covered PixVerse's world model last week and it came close to your ask: https://youtu.be/SAjKSRRJstQ?si=dqybCnaPvMmhpOnV&t=371
(Earlier in the video it shows him live prompting.)
World models are popping up everywhere, from almost every frontier lab.
Comment by Valk3_ 2 hours ago
Comment by ollin 27 minutes ago
From a product perspective, I still don't have a good sense of what the market for WMs will look like. There's a tension between serious commercial applications (robotics, VFX, gamedev, etc. where you want way, way higher fidelity and very precise controllability), vs current short-form-demos-for-consumer-entertainment application (where you want the inference to be cheap-enough-to-be-ad-supported and simple/intuitive to use). Framing Genie as a "prototype" inside their most expensive AI plan makes a lot of sense while GDM figures out how to target the product commercially.
On a personal level, since I'm also working on world models (albeit very small local ones https://news.ycombinator.com/item?id=43798757), my main thought is "oh boy, lots of work to do". If everyone starts expecting Genie 3 quality, local WMs need to become a lot better :)
Comment by WarmWash 5 hours ago
Comment by abraxas 1 hour ago
Although that probably precludes her from having animations in those worlds...
Comment by nozbufferHere 4 hours ago
Comment by Legend2440 3 hours ago
>Genie 3’s consistency is an emergent capability. Other methods such as NeRFs and Gaussian Splatting also allow consistent navigable 3D environments, but depend on the provision of an explicit 3D representation. By contrast, worlds generated by Genie 3 are far more dynamic and rich because they’re created frame by frame based on the world description and actions by the user.
Comment by emmettm 1 hour ago
Comment by sfn42 4 hours ago
Comment by jsheard 4 hours ago
Comment by autonomousErwin 3 hours ago
Comment by dabbz 2 hours ago
Comment by krunck 4 hours ago
Comment by MillionOClock 4 hours ago
Comment by Sol- 1 hour ago
Perhaps better to roam a virtual reality than be starved in the real world.
Comment by avaer 4 hours ago
Comment by boogrpants 2 hours ago
Your neighbors in the street protesting for comprehensive single payer healthcare? Yeah they're perfectly fine leaving your existence up to "market forces".
Copy-paste office workers everywhere reciting memorized platitudes and compliance demands.
You're telling me I could interact even less with such selfish (and often useless given their limited real skillset) people? Deal.
America needs to rethink the compensation package if it wants to survive as a socio-political meme. Happy to call myself Canadian or Chinese if their offer is better. No bullets needed.
Comment by jplusequalt 2 hours ago
You have a dangerously low opinion of your fellow man, and while I sympathize with your frustration, I would humbly suggest you direct that anger at owners of companies/politicians, rather than aim it at your everyday citizen.
Comment by boogrpants 1 hour ago
Those owners and politicians are the result of exposure to American communities, schools, other institutions; they do not spontaneously exist.
Americans prop up the system; as such, Americans will defer, or else their faith was misplaced to begin with. And that ain't right; they're America! So the awfulness will continue until morale improves!
Atheist semantics while living out theist-like devotion to civil-religion memes.
Comment by switchbak 2 hours ago
Comment by koolala 3 hours ago
Comment by TacoCommander 3 hours ago
Comment by switchbak 2 hours ago
Comment by shimman 45 minutes ago
Comment by alex_c 2 hours ago
Comment by slashdave 3 hours ago
Although, I am feeling a bit lazy so let me see if I can simulate a walk.
Comment by moomoo11 3 hours ago
Maybe they can unplug from 500+ AQI pollution and spend time with their loved ones and friends in a simulated clean world?
Imagine working for 10-12 hours a day, and you come home to a pod (and a building could house thousands of pods, paid for by the company) where you plug in and relax for a few hours. Maybe a few more decades of breakthroughs can lead to simulated sleep as well so they get a full rest.
Wake up, head to the factory to make whatever the developed world needs.
(holy fuck that is a horrible existence but you know some people would LOVE for that to be real)
Comment by trenning 1 hour ago
Comment by switchbak 2 hours ago
Except you'll never have to leave your pod. Extract the $$ from their attention all day, then sell them manufactured virtual happiness all night. It's just a more streamlined version of how many people live right now.
I'll be running away from that hellscape, thanks.
Comment by jplusequalt 2 hours ago
Comment by moomoo11 1 hour ago
They'd have their own economy and "life" and leave the rest of us alone. It would be completely transactional, so I'd have zero reason to feel bad if they do it voluntarily.
If they can be happy in a simulated world, and others can be happy in the real world, then everyone wins!
Comment by echelon 4 hours ago
I'm developing filmmaking tools with World Labs' Marble world model:
https://www.youtube.com/watch?v=wJCJYdGdpHg
https://github.com/storytold/artcraft
I think we'll eventually get to the point where these are real time and have consistent representations. I've been excited about world models since I saw the in-the-browser Pokemon demo:
https://madebyoll.in/posts/game_emulation_via_dnn/demo/
At some point, we'll have the creative Holodeck. If you've seen what single improv performers can do with AI, it's ridiculously cool. I can imagine watching entertainers in the future that summon and create entire worlds before us:
https://www.youtube.com/watch?v=MYH3FIFH55s
(If you haven't seen CodeMiko, she's an incredibly talented engineer and streamer. She develops mocap + AI streams.)
Comment by jplusequalt 2 hours ago
Just like how people in the 50s thought we would have flying cars and nuclear fusion by 2000.
Comment by adventured 4 hours ago
It's reality privilege. Most of humanity will yearn for the worlds that AI will cook up for them, customized to their whims.
Comment by jplusequalt 2 hours ago
What data/metric are you drawing from to arrive at this conclusion? How could you even realistically make such a statement?
Comment by lins1909 31 minutes ago
Comment by jacquesm 16 minutes ago
Comment by sy26 5 hours ago
Comment by observationist 5 hours ago
I think he's lucky he got out with his reputation relatively intact.
Comment by qwertyi0k 4 hours ago
Comment by YetAnotherNick 3 hours ago
Comment by qwertyi0k 3 hours ago
Comment by halfmatthalfcat 5 hours ago
Comment by observationist 5 hours ago
Comment by richard___ 4 hours ago
Comment by observationist 4 hours ago
Comment by qwertyi0k 4 hours ago
Comment by observationist 4 hours ago
Comment by mapmeld 3 hours ago
Comment by ezst 4 hours ago
When the right move (strategically, economically) is to not compete, the head of the AI division acknowledging the above and deciding to focus on the next breakthrough seems absolutely reasonable.
Comment by throw310822 48 minutes ago
Comment by acedTrex 13 minutes ago
Comment by throw310822 1 minute ago
Non-developers I know use them to organise meetings, write emails, research companies, write down and summarise counselling sessions (not the clients, the counselor), write press reports, help with advertising campaigns management, review complex commercial insurance policies, fix translations, etc. The list of uses is endless, really.
Comment by anonnon 2 hours ago
Comment by qwertox 5 hours ago
Genie looks at the video: "when this group of pixels looks like this and the user presses 'jump', I will render that group differently, in this particular way, in the next frame."
Genie is an artist drawing a flipbook. To tell you what happens next, it must draw the page. If it doesn't draw it, the story doesn't exist.
JEPA is a novelist writing a summary. To tell you what happens next, it just writes "The car crashes." It doesn't need to describe what the twisted metal looks like to know the crash happened.
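In code terms, the flipbook/novelist distinction is roughly a difference in training targets (module names here are illustrative stand-ins, not either system's real architecture):

    import torch.nn.functional as F

    def flipbook_style_loss(pixel_predictor, frame_t, action, frame_t1):
        # "Draw the next page": the model is penalized on every pixel of the next frame.
        predicted_frame = pixel_predictor(frame_t, action)
        return F.mse_loss(predicted_frame, frame_t1)

    def novelist_style_loss(encoder, latent_predictor, frame_t, action, frame_t1):
        # "Write the summary": predict the next frame's embedding, never its pixels.
        z_t = encoder(frame_t)
        z_next_target = encoder(frame_t1).detach()
        z_next_pred = latent_predictor(z_t, action)
        return F.mse_loss(z_next_pred, z_next_target)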
Comment by general_reveal 4 hours ago
Comment by slashdave 3 hours ago
Comment by phailhaus 5 hours ago
Comment by montebicyclelo 5 hours ago
I don't have access to the DeepMind demo, but from the video it looks like it takes the idea up a notch.
(I don't know the exact lineage of these ideas, but a general observation is that it's a shame that it's the norm for blog posts / indie demos to not get cited.)
[1] https://news.ycombinator.com/item?id=43798757
[2] https://madebyoll.in/posts/world_emulation_via_dnn/demo/
Comment by ollin 3 hours ago
- That forest trail world is ~5 million parameters, trained on 15 minutes of video, scoped to run on a five-year-old iPhone through a twenty-year-old API (WebGL GPGPU, i.e. OpenGL fragment shaders). It's the smallest '3D' world model I'm aware of.
- Genie 3 is (most likely) ~100 billion parameters trained on millions of hours of video and running across multiple TPUs. I would be shocked if it's not the largest-scale world model available to the public.
There are lots of neat intermediate-scale world models being developed as well (e.g. LingBot-World https://github.com/robbyant/lingbot-world, Waypoint 1 https://huggingface.co/blog/waypoint-1) so I expect we'll be able to play something of Genie quality locally on gaming GPUs within a year or two.
Comment by phailhaus 5 hours ago
Look at how much prompting it takes to vibe code a prototype. And they want us to think we'll be able to prompt a whole world?
Comment by whalee 2 hours ago
Problem is, that's not what we've observed to happen as these models get better. In reality there is some metaphysical coarse-grained substrate of physics/semantics/whatever[1] which these models can apparently construct for themselves in pursuit of ~whatever~ goal they're after.
The initially stated position, and your position: "trying to hallucinate an entire world is a dead-end", is a sort of maximally-pessimistic 'the universe is maximally-irreducible' claim.
The truth is much much more complicated.
Comment by post-it 1 hour ago
Comment by phailhaus 1 hour ago
Eh? Context rot is extremely well known. The longer you let the context grow, the worse LLMs perform. Many coding agents will pre-emptively compact the context or force you to start a new session altogether because of this. For Genie to create a consistent world, it needs to maintain context of everything, forever. No matter how good it gets, there will always be a limit. This is not a problem if you use a game engine and code it up instead.
Comment by seedie 4 hours ago
Comment by hmry 1 hour ago
Comment by phailhaus 2 hours ago
Comment by arionmiles 4 hours ago
LLMs can barely remember the coding style I keep asking them to stick to, despite numerous prompts and stuffing that guideline into my (whatever is the newest flavour of) product-specific markdown file. They keep expanding the context window to work around that problem.
If they have something for long-term learning and growth that can help AI agents, they should be leveraging it for competitive advantage.
Comment by asim 4 hours ago
Comment by jsheard 4 hours ago
This is only a useful premise if it can do any of those things accurately, as opposed to dreaming up something kinda plausible based on an amalgamation of every vaguely related YouTube video.
Comment by q3k 4 hours ago
What's the use? Current scientific models clearly showing natural disasters and how to prevent them are being ignored. Hell, ignoring scientific consensus is a fantastic political platform.
Comment by MillionOClock 4 hours ago
Comment by elfly 3 hours ago
Let's say you simulate a long museum hallway with some vases in it. Who holds what? The basic game engine has the geometry, but once the player pushes a vase and moves it, the model needs to inform the engine that it did; then, to draw the next frame, it has to read from the engine first, update the position in the video feed, and feed that back to the engine again.
What happens if the state diverges? Who wins? If the AI wins, then... why have the engine at all?
It is possible, but then who controls physics? The engine, or the AI? The AI could have a different understanding of the details of the vase. What happens if the vase has water inside? Who simulates that? What happens if the AI decides to break the vase? Who simulates the AI?
I don't doubt that some sort of scratchpad to keep track of stuff in game would be useful, but I suspect the researchers are expecting the AI to keep track of everything in its own "head" cause that's the most flexible solution.
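One way to resolve the "who wins" question is to keep the engine authoritative and treat the generative model as a conditioned renderer. A minimal sketch under that assumption (render_frame is a hypothetical conditioned generator, not any real API):

    engine_state = {
        "vase_1": {"position": (4.0, 0.0, 12.0), "broken": False, "contains": "water"},
    }

    def step(engine_state, player_action, render_frame):
        # 1. The engine resolves game logic and stays the single source of truth.
        if player_action == "push vase_1":
            x, y, z = engine_state["vase_1"]["position"]
            engine_state["vase_1"]["position"] = (x + 0.5, y, z)
        # 2. The generator only draws pixels conditioned on that state.
        frame = render_frame(engine_state, player_action)
        # 3. Nothing from the generated pixels is written back, so the two
        #    representations cannot disagree about what "really" happened.
        return engine_state, frame

The trade-off is exactly the one raised above: the AI loses the freedom to improvise physics the engine doesn't know about.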
Comment by MillionOClock 1 hour ago
Comment by jimmar 4 hours ago
Comment by anigbrowl 3 hours ago
Comment by phailhaus 2 hours ago
Comment by godelski 4 hours ago
> Why are they not training models to help write games instead?
Genie isn't about making games... Granted, for some reason they don't put this at the top. Classic Google, not communicating well... | It simulates physics and interactions for dynamic worlds, while its breakthrough consistency enables the simulation of any real-world scenario — from robotics and modelling animation and fiction, to exploring locations and historical settings.
The key part is simulation. That's what they are building this for. Ignore everything else. Same with Nvidia's Earth 2 and Cosmos (and a bit like Isaac). Games or VR environments are not the primary drive; the primary drive is training robots (including non-humanoids, such as Waymo) and just getting the data. It's exactly because of this that you don't need perfect physics (or, let's be honest, realistic physics[0,1]). Getting 50% of the way there in simulation really does cut down the costs of development, even if we recognize that the cost steepens as we approach "there". I really wish they didn't call them "world models", or more specifically didn't shove the word "physics" in there, but hey, is it really marketing if they don't claim a golden goose can not only lay actual gold eggs but also diamonds and that its honks cure cancer?
[0] Looking right does not mean it is right. Maybe it'll match your intuition or undergrad general physics classes with calculus, but talk to a real physicist if you doubt me here. Even one with just an undergrad will tell you this physics is unrealistic, and anyone worth their salt will tell you how unintuitive physics ends up being as you get realistic, even well before approaching quantum. Go talk to the HPC folks and ask them why they need supercomputers... Sorry, physics can't be done from observation alone.
[1] Seriously, I mean look at their demo page. It really is impressive, don't get me wrong, but I can't find a single video that doesn't have major physics problems. That "A high-altitude open world featuring deformable snow terrain." looks like it is simulating Legolas[2], not a real person. The work is impressive, but it isn't anywhere near realistic https://deepmind.google/models/genie/
Comment by phailhaus 2 hours ago
Comment by dyauspitr 4 hours ago
Comment by phailhaus 2 hours ago
Comment by hn_user_9876 12 minutes ago
Comment by artisin 2 hours ago
Comment by 0xcb0 5 hours ago
If making games out of these simulations works, it'd be the end for a lot of big studios, and might be a renaissance for small to one-person game studios.
Comment by jsheard 5 hours ago
Comment by JeremyNT 4 hours ago
There's obviously something insanely impressive about these google experiments, and it certainly feels like there's some kind of use case for them somewhere, but I'm not sure exactly where they fit in.
Comment by falcor84 4 hours ago
Comment by jsheard 3 hours ago
Comment by nsilvestri 5 hours ago
If I am wrong, then the huge supply of fun games will completely saturate demand, and it will be no easier for indie game devs to stand out.
Comment by bdbdbdb 5 hours ago
You COULD create a sailing sim but after ten minutes you might be walking on water, or in the bath, and it would use more power than a small ferry.
There's no way this tech can run on a PS5 or anything close to it.
Comment by WarmWash 5 hours ago
Comment by ziofill 5 hours ago
Comment by hagbarth 3 hours ago
Indie games are already bigger than ever as far as I know.
Comment by avaer 4 hours ago
I mean, if making a game eventually boils down to cooking a sufficient prompt (which to be clear, I'm not talking about text, these prompts are probably going to be more like video databases) then I'm not sure if it will be a renaissance for "one person game studios" any more than AI image generation has been a renaissance for "one person artists".
I want to be optimistic but it's hard to deny the massive distribution stranglehold that the media publishing landscape has, and that has nothing to do with technology.
Comment by Avicebron 5 hours ago
Comment by neom 5 hours ago
Comment by ofrzeta 5 hours ago
Comment by saberience 5 hours ago
"Sure it can write a single function but the code is terrible when it tries to write a whole class..."
Comment by sfn42 2 hours ago
Comment by api 5 hours ago
The deadness you're talking about is there in procedural worlds too, and it stems from the fact that there's not actually much "there." Think of it as a kind of illusion or a magic trick with math. It replicates some of the macro structure of the world but the true information content is low.
Search YouTube for procedural landscape examples. Some of them are actually a lot more visually impressive than this, but without the interactivity. It's a popular topic in the demo scene too where people have made tiny demos (e.g. under 1k in size) that generate impressive scenes.
I expect to see generative AI techniques like this show up in games, though it might take a bit due to their high computational cost compared to traditional procedural generation.
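To make the "low information content" point concrete, a few lines of value noise already unfold an entire terrain from a single seed (a toy sketch, not how any particular demo does it):

    import numpy as np

    def heightmap(size=256, seed=42, octaves=4):
        rng = np.random.default_rng(seed)
        heights = np.zeros((size, size))
        for o in range(octaves):
            cells = 2 ** (o + 2)                             # coarse-to-fine grids: 4, 8, 16, 32
            coarse = rng.random((cells, cells))
            reps = size // cells
            layer = np.kron(coarse, np.ones((reps, reps)))   # blow the grid up to full size
            heights += layer / (2 ** o)                      # finer octaves contribute less
        return heights / heights.max()

    terrain = heightmap()
    print(terrain.shape)   # (256, 256) worth of landscape from one integer seed

Everything past the seed is deterministic elaboration; the apparent richness doesn't correspond to new information.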
Comment by pedalpete 2 hours ago
We saw a very diverse group of users. The most common were paragliders, glider pilots, and pilots who wanted to view their own or other people's flights; ultramarathons, mountain-bike races and some road races, where it provided an interactive way to visualize the course from any angle and distance; and transportation infrastructure, to display train routes to be built. The list goes on.
Comment by Havoc 1 hour ago
Or, in gaming terms, do these models think FPS or RTS?
Text models and pixel-grid vision models are easy to picture, but I'm struggling to wrap my head around what a world model "sees", so to speak.
Comment by nickandbro 6 hours ago
Comment by speak_on 24 minutes ago
Comment by JKCalhoun 5 hours ago
This guy a month ago for example: https://youtu.be/SGJC4Hnz3m0
Comment by cyrusradfar 9 minutes ago
Comment by speak_on 3 hours ago
Comment by mosquitobiten 5 hours ago
Comment by mikelevins 5 hours ago
The game is called "Explorers' Guild", or "xg" for short. It's easier for Claude to act as a player than a director (xg's version of a dungeon master or game master), again mainly because of permanence and learning issues, but to the extent that I can help it past those issues it's also fairly good at acting as a director. It does require some pretty specific stuff in the system prompt to, for example, avoid confabulating stuff that doesn't fit the world or the scenario.
But to really build a version of xg on Claude it needs better ways to remember and improve what it has learned about playing the game, and what it has learned about a specific group of players in a specific scenario as it develops over time.
Comment by matt_LLVW 1 hour ago
Comment by ge96 5 hours ago
Comment by meetpateltech 6 hours ago
Try it in Google Labs: https://labs.google/projectgenie
(Project Genie is available to Google AI Ultra subscribers in the US, ages 18+.)
Comment by spullara 48 minutes ago
Comment by user_hn_827 1 hour ago
Comment by binsquare 3 hours ago
I can't even fathom what it would be like for the future of simulation and physical world when it gets far more accurate and realistic.
Comment by Bjorkbat 3 hours ago
This is most evident in the way things collide.
Comment by binsquare 2 hours ago
If it continues to improve at a rate similar to LLMs, a way to simulate fluid dynamics or structural dynamics with reasonable accuracy and speed could unlock a much faster pace of innovation in the physical world (and be validated with rigorous scientific methods).
Comment by bpiche 3 hours ago
Comment by sebasv_ 1 hour ago
Comment by littlekey 59 minutes ago
I'm not certain, but I think the LLM is also generating the physics itself. It's generating rules based on its training data; e.g., watch a cat walk enough and it can simulate how the cat moves in the generated "world".
Comment by bigblind 2 hours ago
Comment by throwaway314155 46 minutes ago
Comment by RivieraKid 5 hours ago
Comment by cpeth 2 hours ago
Comment by dominick-cc 2 hours ago
Comment by artur_makly 1 hour ago
Comment by almosthere 1 hour ago
Comment by srameshc 5 hours ago
Comment by in-silico 4 hours ago
The goal of world models like Genie is to be a way for AI and robots to "imagine" things. Then, they could practice tasks inside of the simulated world or reason about actions by simulating their outcome.
Comment by xyzsparetimexyz 5 hours ago
Comment by mediaman 4 hours ago
Comment by hiccuphippo 5 hours ago
Comment by saberience 5 hours ago
Comment by aurumque 5 hours ago
Comment by mikewittie 4 hours ago
Comment by educasean 5 hours ago
Comment by rvz 3 hours ago
Comment by adventured 4 hours ago
Humanity goes into the box and it never comes back out. It's better in there than it is out there for 99% of the population.
Comment by gambiting 3 hours ago
How are you justifying the enormous energy cost this toy is using, exactly?
I don't find anything "responsible" about this. And it doesn't even seem like something that has any actual use - it's literally just a toy.
Comment by anxtyinmgmt 5 hours ago
Comment by moohaad 4 hours ago
Comment by mupuff1234 2 hours ago
RIP Stadia.
Comment by WhereIsTheTruth 2 hours ago
While "journalists" were busy bootlicking a laggy 720p Android only xCloud beta, Stadia was already delivering flawless 4K@60FPS in a web browser
They killed the only platform that actually worked just to protect Microsoft
This will be a textbook case study in how a legacy monopoly kills innovation to protect its own mediocrity
Microsoft won't survive the century, they are a dinosaur on borrowed time that has already lost the war in mobile, AI, and robotics
They don't create, they just buy market share to suffocate the competition and ruin every product they touch
Even their cloud dominance is about to end, as they are already losing their grip on the European market to antitrust and sovereign alternatives
Comment by JaiRathore 4 hours ago
Comment by moomoo11 3 hours ago
Comment by seanmozeik 54 minutes ago
Comment by TacoCommander 3 hours ago
Comment by rationalfaith 4 hours ago
Comment by analog8374 4 hours ago
Comment by avaer 4 hours ago
But I do think it's a partial existence proof.
Comment by analog8374 3 hours ago
Comment by cloudflare728 4 hours ago
Comment by lexandstuff 4 hours ago
Comment by HardCodedBias 4 hours ago
I mean, yes, the probability of having that level of tech in decades is quite high.
But the technology is moving very fast right now. It sounds crazy, but I think that there is a 50% chance of having ready player one level technology before 2030.
It's absolutely possible it will take more time to become economical.