Darkbloom – Private inference on idle Macs
Posted by twapi 1 day ago
Comments
Comment by kennywinker 1 day ago
Comment by eigengajesh 1 day ago
That's why we don't recommend purchasing a new machine. An existing machine costs you nothing extra to run this.
Electricity is one cost, but it gets paid off by every request the machine serves; electricity is only consumed when you run an inference. If you have any questions, DM me @gajesh on Twitter.
Comment by mbesto 1 day ago
You misunderstood. If the ROI is there, there is enough capital in existence for you to accelerate your profit. So why even deal with the complexity of renting people's hardware when you can do it yourself?
Comment by splintercell 12 hours ago
Comment by BuildTheRobots 21 hours ago
The calculator gives numbers for nearly everything, but I can't obviously see how much space it needs for model storage or how many writes of temp files I should expect if I'm running flat out.
Comment by MithrilTuxedo 10 hours ago
Comment by sleepybrett 19 hours ago
Comment by neurostimulant 8 hours ago
Comment by stavros 1 day ago
Comment by washadjeffmad 1 day ago
Out of our >3000 currently active Apple Silicon Macs, failures due to non-physical damage are in the single digits per year. Of those, none have been from production systems with 24/7 uptime and continuous high load, which reflects your parenthetical.
Perhaps we haven't met the other end of the bathtub curve yet, but we also won't be retaining any of these very far beyond their warranty period, much less the end of their support life.
Comment by bbatsell 21 hours ago
Comment by alsetmusic 21 hours ago
It’s three years for Macs, though I believe you can pay annually for longer. Five has never been a thing to my knowledge.
Comment by stavros 1 day ago
Comment by embedding-shape 1 day ago
How much though? Say I have three Mac Minis next to each other, one that is completely idle but on, one that bursts to 100% CPU every 10 minutes, and one that uses 100% CPU all the time: what's the difference in how long the machines survive? Months, years or decades?
Comment by LPisGood 17 hours ago
Comment by dmitrygr 19 hours ago
That is not at all how modern chips work. Idle chips are mostly powered down, non-idle ones are working and that causes real measurable wear and tear on the silicon. CPU, RAM, NAND all wear and tear measurably with use on current manufacturing processes.
Comment by Barbing 17 hours ago
Comment by ta988 17 hours ago
Comment by dmitrygr 17 hours ago
Comment by avidphantasm 1 day ago
Comment by xhkkffbf 23 hours ago
Comment by edbaskerville 18 hours ago
Comment by dgacmu 1 day ago
And then there's a hit for overprovisioning in general. If the network is not overprovisioned somewhat, customers won't be able to get requests handled when they want, and they'll flee. But the more overprovisioned it is, the worse it is for compute seller earnings.
I suspect an optimistic view of earnings from a platform like this would be something like 1/8 utilization on a model like Gemma 4. Their calculator estimates my m4 pro mini could earn about $24/month at 3 hours/day on that model. That seems plausible.
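That $24/month figure roughly reproduces if you plug an M4 Pro into the calculator formula quoted elsewhere in the thread. A sketch, where the ~273 GB/s memory bandwidth is my assumption and the 0.60 efficiency factor, batch of 10 with a 0.9 penalty, and $0.20/Mtok pricing are taken from the quoted formula (none of these are confirmed Darkbloom numbers):

```python
# Hypothetical reconstruction of the calculator math for an M4 Pro mini.
# ASSUMPTION: ~273 GB/s memory bandwidth; all other constants are from
# the formula quoted in this thread, not from Darkbloom directly.
BANDWIDTH_GBS = 273
single_tok_s = (BANDWIDTH_GBS / 4) * 0.60    # ~41 tok/s single stream
batched_tok_s = single_tok_s * 10 * 0.9      # ~369 tok/s with batching
revenue_per_hr = batched_tok_s * 3600 / 1e6 * 0.20   # $/hr at $0.20/Mtok
monthly = revenue_per_hr * 3 * 30            # 3 hours/day for a month
print(f"${monthly:.2f}/month")               # lands near the quoted $24
```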
Comment by liuliu 21 hours ago
Comment by pgporada 20 hours ago
Comment by psychoslave 1 day ago
Assuming that getting a large chunk of initial investment is just a formality is out of touch with the reality of 99% of people out there, when it's actually the biggest friction point in any socio-economic endeavour.
Comment by chaoz_ 1 day ago
Non-VC play (not required until you can raise on your own terms!) and clear differentiation.
If you want to go full business evaluation, I would be more worried about someone else implementing the same thing with a better commission split (imo 95% plus being first to market is good enough).
Comment by jonplackett 1 day ago
ie. Does anyone know the payback time for a B100 used just for inference? I assume it’s more than a couple of months? Or is it just training that costs so much?
Comment by Saline9515 1 day ago
Comment by dnnddidiej 1 day ago
Prolly gonna make $50 a year tops.
Comment by CTDOCodebases 23 hours ago
When YouTubers start making videos about it you know it's too late.
Comment by dnnddidiej 15 hours ago
Comment by liuliu 22 hours ago
A H200 gives you ~4 PFLOPs, which is ~60x at only ~40x price (assuming you can get a Mac Mini at $1000). (Not to mention, BTW, RTX PRO 6000 is ~7x price for ~40x more FLOPs).
Your M4 Mac Mini only has ~20 TFLOPs.
Comment by bentobean 21 hours ago
What a time to be alive.
Comment by znnajdla 1 day ago
As a business owner, I can think of multiple reasons why a decentralized network is better for me as a business than relying on a hyperscaler inference provider.
1. No dependency on a BigTech provider who can cut me off or change prices at any time. I’m willing to pay a premium for that.
2. I get a residential IP proxy network built-in. AI scrapers pay big money for that.
3. No censorship.
4. Lower latency if inference nodes are located close to me.
Comment by kennywinker 1 day ago
Comment by znnajdla 1 day ago
Comment by lxglv 1 day ago
Comment by znnajdla 1 day ago
Comment by aacid 1 day ago
Comment by znnajdla 1 day ago
Comment by rzwitserloot 1 day ago
Running AI inference increases the power draw, and requires certain hardware.
Mining bitcoin increases the power draw, and requires certain hardware.
OP's point thus stands: Bad players will find places to get far cheaper power than the intended audience, and will buy dedicated hardware, at which point the money you can earn to do this will soon drop below the costs for power (for folks like you and me).
Maybe that won't happen, but why won't that happen?
Comment by znnajdla 1 day ago
Comment by kennywinker 23 hours ago
Comment by NiloCK 1 day ago
You - Darkbloom - Operator - Darkbloom - you, vs
You - Provider - you
---
On the censorship point - this is an interesting risk surface for operators. If people are drawn to my decentralized model provisioning for its lax censorship, I'm pretty sure they're using it to generate things that I don't want to be liable for.
If anything, I could imagine dumber and stricter brand-safety style censorship on operator machines.
Comment by znnajdla 1 day ago
Comment by yard2010 1 day ago
Comment by thih9 1 day ago
Others are reporting low demand, eg.: https://news.ycombinator.com/item?id=47789171
Comment by p1necone 17 hours ago
Comment by gleenn 1 day ago
Comment by kennywinker 1 day ago
Comment by ffsm8 1 day ago
Comment by runako 1 day ago
Very smart play to build a platform, get scale, and prove out the software. Then either add a small network fee (this could be on money movement on/off platform), add a higher tier of service for money, and/or just use the proof points to go get access to capital and become an operator in your own pool.
Comment by nxpnsv 1 day ago
Comment by runako 1 day ago
This is essentially the same reason even the best money managers take outside money to start, even if they eventually kick out the investors.
Comment by agnosticmantis 1 day ago
- Elon Musk during Tesla's Autonomy Day in April 2019.
Comment by foota 1 day ago
Comment by kennywinker 1 day ago
Comment by Filligree 1 day ago
Also they’ve already launched a crypto token, which is a terrible sign.
Comment by znpy 1 day ago
Comment by tgma 1 day ago
In 15 minutes of serving Gemma, I got precisely zero actual inference requests, and a bunch of health checks and two attestations.
At the moment they don't have enough sustained demand to justify the earning estimates.
Comment by splittydev 1 day ago
Comment by tgma 1 day ago
Still, absolute zero is an unacceptable number. Had this running for more than an hour.
Comment by splittydev 1 day ago
Sure, it would be great if you'd immediately get hammered with hundreds of requests and start making money quickly. It would also be great if it were a bit more transparent and you could see more stats (what counts as "idle"? Is my machine currently eligible to serve models?). But it's still very new; I'd say give it some time and let's see how it goes.
If you have it running and you get zero requests, it uses close to zero power above what your computer uses anyway. It doesn't cost you anything to have it running, and if you get requests, you make money. Seems like an easy decision to me.
Comment by usrusr 1 day ago
Comment by tgma 1 day ago
Comment by yard2010 1 day ago
Comment by jagged-chisel 17 hours ago
Comment by hamiltont 1 hour ago
This appears on their credit purchase page right now, but you have to email them to get credits (everyone starts with zero)
Comment by subroutine 1 day ago
Comment by lxglv 1 day ago
Comment by lostmsu 1 day ago
The numbers are absolute fraud. You shouldn't be installing their software, because the fraud may not stop at the numbers.
Comment by rjmunro 1 day ago
Comment by mhast 1 day ago
Given their estimates of a Mac being able to generate $1k (per month?) a 5090 with a lot more power would be able to generate $50k. For a $3k piece of hardware. Which is obviously not realistic. (As in, nobody is paying that much for the images, which seems to match well with no actual requests on the system.)
Comment by iepathos 19 hours ago
Comment by thatxliner 1 day ago
I was thinking of building this exact thing a year ago, but my main blocker was the economics: it would never make sense for someone to use the API, and nobody can make money off of zero demand.
I guess we just have to look at how Uber and Airbnb bootstrapped themselves. Another issue with my original idea was that it targeted compute in general, when the main, best use case is long(er)-running software like AI training (though I guess inference is long-running enough).
But there already exists software out there that lets you rent out your GPU, so...
Comment by tgma 1 day ago
Comment by starkeeper 1 day ago
Comment by lostmsu 1 day ago
Comment by LPisGood 16 hours ago
Comment by tnchr 1 day ago
Comment by elbac 17 hours ago
WARN STT backend failed health check — model will NOT be advertised
Comment by gleenn 1 day ago
Comment by mirashii 1 day ago
That said, their privacy posture, the cornerstone of their claims, is snake oil with gaping holes in it, so I still wouldn't trust it, but it's worth being accurate about how exactly they're messing up.
Comment by mike_hearn 1 day ago
You are right - the "nonce binding" the paper uses doesn't seem convincing. The missing link is that Apple's attestation doesn't bind app generated keys to a designated requirement, which would be required to create a full remote attestation.
Comment by mirashii 1 day ago
It only effectively allows this for applications that are in the set of things covered by SIP, but not for any third-party application. There's nothing that will allow you to attest that arbitrary third-party code is running some specific version without being tampered with, you can only attest that the base OS/kernel have not been tampered with. In their specific case, they attempt to patch over that by taking the hash of the binary, but you can simply patch it before it starts.
To do this properly requires a TEE to be available to third-party code for attestation. That's not a thing on macOS today.
Comment by mike_hearn 1 day ago
public key from SEP -> designated requirement of owning app binary
The macOS KeyStore infrastructure does track this, which is why I thought it'd work. But the paper doesn't mention being able to get this data server-side anywhere. Instead there's this nonce hack. It's odd that the paper considers so many angles, including things like RDMA over Thunderbolt, but not the binding between platform key and app key.
Reading the paper again carefully I get the feeling the author knows or believes something that isn't fully elaborated in the text. He recognizes that this linkage problem exists, proposes a solution and offers a security argument for it. I just can't understand the argument. It appears APNS plays a role (apple push notification service) and maybe this is where app binding happens but the author seems to assume a fluency in Apple infrastructure that I currently lack.
Comment by mirashii 1 day ago
Certainly, it still doesn't get you there with their current implementation, as the attempts at blocking the debugger like PT_DENY_ATTACH are runtime syscalls, so you've got a race window where you can attach still. Maybe it gets you there with hardened runtime? I'd have to think a bit harder on that.
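For illustration, PT_DENY_ATTACH is nothing more than an ordinary runtime syscall, which is exactly why that race window exists. A minimal sketch (macOS-only; illustrative, not Darkbloom's actual code):

```python
import ctypes
import sys

PT_DENY_ATTACH = 31  # macOS-specific ptrace(2) request

def deny_attach() -> bool:
    """Ask the kernel to refuse future debugger attaches to this process.

    Because this runs *after* process startup, a debugger that attached
    before this line executes is unaffected: that's the race window.
    """
    if sys.platform != "darwin":
        return False  # PT_DENY_ATTACH does not exist on other platforms
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.ptrace(PT_DENY_ATTACH, 0, 0, 0) == 0
```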
Comment by mike_hearn 1 day ago
I'm not quite sure why Apple haven't enabled DCAppAttest on macOS. From my understanding of the architecture, they have every piece needed. It's possible that they just don't trust the Mac platform enough to sign off on assertions about it, because it's a lot more open so it's harder to defend. And perhaps they feel the reputational risk isn't worth it, as people would generalize from a break of App Attest on macOS to App Attest on iOS where the money is. Hard to say.
Comment by ramoz 1 day ago
Apple Silicon has a Secure Enclave, but not a public SGX/TDX/SEV-style enclave for arbitrary code, so these claims are about OS hardening, not verifiable confidential execution.
It would be nice if it were possible. There are a lot of cool innovations possible beyond privacy.
Comment by mike_hearn 1 day ago
macOS has a strong enough security architecture that something like Darkbloom would have at least some credibility if there was a way to remotely attest a Mac's boot sequence and TCC configuration combined with key-to-DR binding. The OS sandbox can keep apps properly separated if the kernel is correct and unhacked. And Apple's systems are full of mitigations and roadblocks to simple exploitation. Would it be as good as a consumer SGX enclave? Not architecturally, but the usability is higher.
Comment by znnajdla 1 day ago
Comment by ramoz 1 day ago
You have no guarantees over any random laptop connected across the world.
Comment by znnajdla 1 day ago
Comment by nl 1 day ago
The key question here is how they avoid the outside computer being able to view the memory of the internal process:
> An in-process inference design that embeds the inference engine directly in a hardened process, eliminating all inter-process communication channels that could be observed, with optional hypervisor memory isolation that extends protection from software-enforced to hardware-enforced via ARM Stage 2 page tables at zero performance cost.[1]
I was under the impression this wasn't possible if you are using the GPU. I could be misled on this though.
[1] https://github.com/Layr-Labs/d-inference/blob/master/papers/...
Comment by nitros 1 day ago
Comment by flockonus 1 day ago
And more so in particular, anyone using Darkbloom with commercial intent should only really send non-sensitive data (no tokens, customer data, ...). I'd say only classification tasks, image generation, etc.
Comment by joelthelion 1 day ago
Comment by mike_hearn 1 day ago
Comment by ramoz 1 day ago
Macs have secure enclaves.
Comment by nl 1 day ago
But they argue that:
> PT_DENY_ATTACH (ptrace constant 31): Invoked at process startup before any sensitive data is loaded. Instructs the macOS kernel to permanently deny all ptrace requests against this process, including from root. This blocks lldb, dtrace, and Instruments.
> Hardened Runtime: The binary is code-signed with hardened runtime options and explicitly without the com.apple.security.get-task-allow entitlement. The kernel denies task_for_pid() and mach_vm_read() from any external process.
> System Integrity Protection (SIP): Enforces both of the above at the kernel level. With SIP enabled, root cannot circumvent Hardened Runtime protections, load unsigned kernel extensions, or modify protected system binaries. Section 5.1 proves that SIP, once verified, is immutable for the process lifetime.
gives them memory protection.
To me that is surprising.
Comment by mirashii 1 day ago
[1] https://github.com/Layr-Labs/d-inference/blob/master/papers/...
Comment by mirashii 1 day ago
Comment by dinobones 1 day ago
If it's not running fully end to end in some secure enclave, then it's always just a best effort thing. Good marketing though.
Comment by mike_hearn 1 day ago
Apple is perfectly capable of doing remote attestation properly. iOS has DCAppAttest which does everything needed. Unfortunately, it's never been brought to macOS, as far as I know. Maybe this MDM hack is a back door to get RA capabilities, if so it'd certainly be intriguing, but if not as far as I know there's no way to get a Mac to cough up a cryptographic assertion that it's running a genuine macOS kernel/boot firmware/disk image/kernel args, etc.
It's a pity because there's a lot of unique and interesting apps that'd become possible if Apple did this. Darkbloom is just one example of what's possible. It'd be a huge boon to decentralization efforts if Apple activated this, and all the pipework is laid already so it's really a pity they don't go the extra mile here.
Comment by woadwarrior01 1 day ago
Apple's docs claim it's been available on macOS since macOS 11. Am I missing something here?
https://developer.apple.com/documentation/devicecheck/dcappa...
Comment by mike_hearn 23 hours ago
https://developer.apple.com/documentation/devicecheck/dcappa...
> If you read isSupported from an app running on a Mac device, the value is false. This includes Mac Catalyst apps, and iOS or iPadOS apps running on Apple silicon.
Comment by woadwarrior01 23 hours ago
Comment by saagarjha 1 day ago
Comment by vrockdub 1 day ago
Comment by jeroenhd 1 day ago
The biggest argument for remote attestation I can think of is to make sure nobody is returning random bullshit and cashing in prompt money on a massive scale.
Comment by 190n 19 hours ago
All you have to do is attach to the process before it does that, and then prevent this call from going through.
Comment by saagarjha 1 day ago
Comment by ramoz 1 day ago
Protection here is conditional, best-effort. There are no true guarantees, nor actual verifiability.
Comment by pants2 1 day ago
My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B. Darkbloom's pricing is $0.20 per Mtok output.
That's about $2.24/day or $67/mo revenue if it's fully utilized 24/7.
Now assuming 50W sustained load, that's about 36 kWh/mo, at ~$.25/kWh approx. $9/mo in costs.
Could be good for lunch money every once in a while! Around $700/yr.
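Rough sanity check of the arithmetic above, using only the inputs from this comment (the 50 W sustained load and $0.25/kWh are the estimates given, not measured figures):

```python
# Back-of-envelope check of the revenue/cost numbers above.
rev_per_day = 130 * 86_400 / 1e6 * 0.20       # 130 tok/s at $0.20/Mtok -> ~$2.25
rev_per_month = rev_per_day * 30              # ~$67/month if fully utilized
cost_per_month = 50 / 1000 * 24 * 30 * 0.25   # 50 W sustained at $0.25/kWh -> ~$9
net_per_year = (rev_per_month - cost_per_month) * 12   # ~$700/year
```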
Comment by mavamaarten 1 day ago
I'd say it's not worth it. But the idea is cool.
Comment by jorvi 1 day ago
Thermal stress from bursty workloads is much more of a wearing problem than electromigration. If you can consistently keep the SoC at a specific temperature, it'll last much longer.
This is also why it was very ironic that crypto-miner GPUs would get sold at massive discounts. Everyone assumed they had been run ragged, but a proper miner would have undervolted the card and run it at consistent utilization, meaning the card would be in better condition than a secondhand gamer GPU that had constantly been shifting between 1% and 80% utilization, or rather, 30°C and 75°C.
Comment by kennywinker 1 day ago
Comment by kennywinker 1 day ago
For Gemma 4 26B their math is:
single_tok/s = (307 GB/s / 4 GB) * 0.60 = 46.0 tok/s
batched_tok/s = 46.0 * 10 * 0.9 = 414.4 tok/s
tok/hr = 414.4 * 3600 = 1,492,020
revenue/hr = (1,492,020 / 1M) * $0.200000 = $0.2984
I have no idea if that is a good estimate of how much an M5 Pro can generate - but that’s what it says on their site.
They do a bit of a sneaky thing with the power calculation: they subtract 12 W of idle power, because they assume your machine is idling 24/7, so the only cost is the extra 18 W they estimate you'll use doing inference. Idk about you, but I do turn my machine off when I'm not using it.
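For what it's worth, the quoted figures are internally consistent if you keep full precision; the displayed intermediates (46.0, 414.4) are just rounded. A quick check, reproducing their formula exactly as quoted above:

```python
# Reproducing the site's quoted Gemma 4 26B calculator math at full precision.
single = (307 / 4) * 0.60                # 46.05 tok/s (displayed as 46.0)
batched = single * 10 * 0.9              # 414.45 tok/s (displayed as 414.4)
tok_per_hr = batched * 3600              # 1,492,020 exactly
revenue_per_hr = tok_per_hr / 1e6 * 0.20 # ~$0.2984
```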
Comment by pants2 1 day ago
Comment by nnx 1 day ago
This seems high. At which quantization? Using LM Studio or something else?
Note: Darkbloom seems to run everything on Q8 MLX.
Comment by pants2 1 day ago
Comment by torginus 1 day ago
I figured since I already used it a lot, and I've never had a GPU fail on me, it would be fine.
The fans on it died in a month of constant use, replacing them was more money than what I made on mining.
Comment by todotask2 1 day ago
I don’t think this is a sustainable business model. For example, Cubbit tried to build decentralised storage, but I backed out because better alternatives now exist, and hardware continues to improve and become cheaper over time.
Your electricity and ownership costs are going to yield a lower return over time, and this doesn't actually reduce CO2.
Comment by chaoz_ 1 day ago
I’d imagine 1 year of heavy usage would somehow affect its quality.
Comment by pants2 1 day ago
Comment by xendo 1 day ago
Comment by znnajdla 1 day ago
Comment by sethherr 1 day ago
Comment by tonyedgecombe 1 day ago
Comment by znnajdla 1 day ago
Comment by dgacmu 1 day ago
Oh, also, you seem to have some bugs:
Gemma: WARN [vllm_mlx] RuntimeError: Failed to load the default metallib. This library is using language version 4.0 which is not supported on this OS. library not found library not found library not found
cohere: 2026-04-16T14:25:10.541562Z WARN [stt] File "/Users/dga/.darkbloom/bin/stt_server.py", line 332, in load_model 2026-04-16T14:25:10.541614Z WARN [stt] from mlx_audio.stt.models.cohere_asr import audio as audio_mod 2026-04-16T14:25:10.541643Z WARN [stt] ModuleNotFoundError: No module named 'mlx_audio.stt.models.cohere_asr'
Trying to download the flux image models fails with:
curl: (56) The requested URL returned error: 404
darkbloom earnings does not work
your documentation is inconsistent, saying 100% of revenue goes to providers in one place and 95% in another
I think... this needs a little more care and feeding before you open it up widely. :) And maybe lay off the LLM-generated text before it gets you in trouble for promising things you're not delivering.
Comment by canarias_mate 20 hours ago
Comment by haspok 1 day ago
Problem is, from a technical point of view, what kind of made sense back then (most people running desktops, fans always on, energy saving minimal) is kind of stupid today (even if your laptop has no fan, would you want it to be always generating heat?)...
I definitely want my laptops to be cool, quiet and idle most of the time.
Comment by kamranjon 1 day ago
Comment by vorticalbox 1 day ago
That is short bursts of heat, 5-10 min during a render. I would not be happy with that for multiple hours a day; I'm sure it would have a negative effect on battery health.
Comment by dchuk 23 hours ago
I wish this was self hostable, even for a license fee. Many businesses have fleets of Macs, sometimes even in stock as returned equipment from employees. Would allow for a distributed internal inference network, which has appeal for many orgs who value or require privacy.
Comment by TuringNYC 1 day ago
Comment by damezumari 1 day ago
Comment by zozbot234 1 day ago
Comment by utopiah 1 day ago
Comment by TuringNYC 23 hours ago
Comment by utopiah 7 hours ago
Comment by frankfrank13 22 hours ago
- Convincing labs to run distributed, burst-y inference
- Convincing people to run their Mac all day, hoping to make a little profit
- Convincing users to trust a distributed network of un-trusted devices
I had a similar idea, pre-AI, just for compute in general. But solving even 1 of those 3 (swap AI lab for managed-compute-type-company, eg Supabase, Vercel) is nearly impossible.
Comment by MyUltiDev 20 hours ago
Comment by pants2 1 day ago
Comment by rvz 1 day ago
Comment by kennywinker 1 day ago
;P
Comment by BingBingBap 1 day ago
What could possibly go wrong?
Comment by MicBook56 1 day ago
Comment by jzig 21 hours ago
Comment by creamyhorror 21 hours ago
Comment by NiloCK 1 day ago
Right now the dashboards show 78 providers online, but someone in-thread here said that they spun one up and got no requests. Surely someone would be willing to beat the posted rate and swallow up the demand?
I expect this is a migration target, but a tactical omission from V1 comms both for legitimate legibility reasons (I can sell x for y is easier to parse than 'I can participate in a marketplace') and slightly illegitimate legibility reasons (obscuring likely future price collapse).
Still - neat project that I hope does well.
[1] Layer Labs, formerly EigenLayer, is a company built around a protocol to abstract and recycle economic-security guarantees from Ethereum proof of stake.
Comment by Inferlane 1 day ago
Comment by woadwarrior01 1 day ago
Got the latest v0.3.8 version from the list here: https://api.darkbloom.dev/v1/releases/latest
Three binaries and a Python file: darkbloom (Rust)
eigeninference-enclave (Swift)
ffmpeg (from Homebrew, lol)
stt_server.py (a simple FastAPI speech-to-text server using mlx_audio).
The good parts: All three binaries are signed with a valid Apple Developer ID and have Hardened runtime enabled.
Bad parts: Binaries aren't notarized. Enrolls the device for remote MDM using micromdm. Downloads and installs a complete Python runtime from Cloudflare R2 (Supply chain risk). PT_DENY_ATTACH to make debugging harder. Collects device serial numbers.
TL;DR: No, not touching that.
Comment by amdivia 1 day ago
Comment by jdironman 16 hours ago
Comment by poorman 20 hours ago
Comment by qurren 20 hours ago
Too expensive. It's probably producing 200 watts average for 8 hours a day. That's 1600 watt hours, which is about $1.60 at PG&E prices. That would take 187 days to recoup the cost of just the panel.
If you include installation costs and "what PG&E steals if you wire it to the same grid" it's probably more like 4x that, which is too long.
Tell me when we can have 400 watt solar panels for $50. Stupid capitalism literally forces solar panel prices to make it unprofitable.
People should never have to take out loans for solar. Solar should be subsidized and forced by the government to be so cheap that it repays for its cost within a month. Then we're talking. Most things I buy to save money, I expect them to repay within a month. Maybe 2 months max.
Comment by dgacmu 19 hours ago
Installation costs and inverters not included, however.
Comment by poorman 20 hours ago
Comment by qurren 20 hours ago
Just pointing out why capitalism + solar is a failure. Capitalism reprices the good thing to be equally expensive to the bad thing, so that nobody buys the good thing anymore.
Comment by icedrift 17 hours ago
Comment by 0xbadcafebee 1 day ago
Comment by dr_kiszonka 1 day ago
When your Mac is idle (no inference requests), it consumes minimal power — you don't lose significant money waiting for requests. The electricity costs shown only apply during active inference.
Text models typically see the highest and most consistent demand. Image generation and transcription requests are bursty — high volume during peaks, quiet otherwise."
Comment by alexpotato 1 day ago
I believe the idea was that people could submit big workloads, the server would slice them up and then have the clients download and run a small slice. You as the computer owner would then get some payout.
Interesting to see this coming back again.
Comment by willquack 16 hours ago
I added Python execution support via Pyodide (cpython compiled to wasm) and worked on a bunch of other random stuff like WebLLM inferencing during my time there.
Apart from Distributive, there's also the "Golem network", "Salad", "Koii" and various other similar projects.
---
I'm not sure if I'm convinced by the "Uber for compute" use case with compute buyers and compute workers (sellers), but if you're a university and you have 1000 Windows machines across all your computer labs, it'd be nice to leverage that compute for running research or something idk - especially with the price of ram / cloud offerings these days...
Comment by thekid314 1 day ago
Comment by Jn2G3Np8 1 day ago
But trying it out it still needs work, I couldn't download a model successfully (and their list of nodes at https://console.darkbloom.dev/providers suggests this is typical).
And as a cursory user, it took me some digging to find out that to cash out you need a Solana address (providers > earnings).
Comment by dkroy 19 hours ago
Comment by poorman 19 hours ago
Comment by heddycrow 1 day ago
We’ve been building something similar for image/video models for the past few months, and it’s made me think distribution might be the real bottleneck.
It’s proving difficult to get enough early usage to reach the point where the system becomes more interesting on its own.
Curious how others have approached that bootstrap problem. Thanks in advance.
Comment by jaffee 1 day ago
Comment by utkarsh_apoorva 1 day ago
They lost me with just one microcopy - “start earning”. Huge red signal.
Comment by Hamuko 1 day ago
Comment by miki123211 1 day ago
Is there some actual cryptography behind this, or just fundamentally-breakable DRM and vibes?
Comment by auslegung 22 hours ago
Comment by WatchDog 1 day ago
Available models (2):
CohereLabs/cohere-transcribe-03-2026 (4.6 GB)
flux_2_klein_9b_q8p.ckpt (20.2 GB)
...
Advertising 0 model(s) (only loaded models)
Also, the benchmark just doesn't work. Interesting idea, but needs some work.
Comment by eigengajesh 1 day ago
Comment by daniel_iversen 23 hours ago
Comment by 0xc133 1 day ago
Comment by SlavikCA 23 hours ago
Comment by gndp 1 day ago
Comment by podviaznikov 1 day ago
On 15.1 it failed to serve models.
Updated to the latest 15.5 and it fails to run the binary.
Comment by Schiendelman 1 day ago
Comment by jboggan 1 day ago
Comment by drob518 1 day ago
Comment by v9v 1 day ago
Comment by smooth968 23 hours ago
Comment by AustinDev 17 hours ago
Comment by puttycat 1 day ago
Afaik you will need to decrypt the data the moment it needs to be fed into the model.
How do they do this then?
Comment by mr_mitm 1 day ago
Remember, all encryption is E2EE if you're not picky about the ends.
Comment by czk 21 hours ago
Comment by ponyous 1 day ago
Comment by dangoodmanUT 1 day ago
Comment by ripped_britches 1 day ago
Comment by koliber 1 day ago
Comment by chaoz_ 1 day ago
Guess there are limitations on the size of the models, but if top-tier models get democratized I don’t see a reason not to use this API. The only thing that comes to mind is data privacy concerns.
I think batch-evals for non-sensitive data has great PMF here.
Comment by 59nadir 1 day ago
Heh, what did they win exactly? This is just a way for another company to extract value out of the single region of the world where Apple is a relevant vendor, and it happens to be the one where it's the easiest to pull people into schemes.
Comment by rvz 1 day ago
Because they were already at the finish line with Apple Silicon.
> I don’t see a reason not to use this API. The only thing that comes to me is data privacy concerns.
The whole inference is end-to-end encrypted so none of the nodes can see the prompts or the messages.
Comment by chaoz_ 1 day ago
That would finally be a crypto thing which is backed by value I believe in.
Comment by resonanormal 1 day ago
Comment by Havoc 1 day ago
Comment by throwatdem12311 1 day ago
Comment by bentt 1 day ago
Comment by DeathArrow 1 day ago
Comment by btown 1 day ago
> Apple’s attestation servers will only generate the FreshnessCode for a genuine device that checks in via APNs. A software-only adversary cannot forge the MDA certificate chain (Assumption 3). Combined with SIP enforcement (preventing binary replacement) and Secure Boot (preventing bootloader tampering), this provides strong evidence that the signing key resides in genuine Apple hardware.
Comment by saagarjha 1 day ago
Comment by btown 1 day ago
Comment by saagarjha 1 hour ago
Comment by nl 1 day ago
NVidia data center GPUs have a similar path, but not their consumer ones. Not sure about the NVidia Spark.
It's possible AMD Strix Halo can do this, but unlikely for any other PC based GPU environments.
Comment by MrDrMcCoy 1 day ago
Comment by stryakr 1 day ago
Comment by subpixel 1 day ago
Comment by ianpurton 1 day ago
Comment by chakintosh 1 day ago
Comment by grvbck 1 day ago
Macbook Air M2 8GB 12h/day -> $647/month
Mac Mini M4 32GB 12h/day -> $290/month
I mean, I'd be happy to buy a few used M2 Airs with minimal specs and start printing money but…
Comment by bprasanna 1 day ago
Comment by autodidacticon 18 hours ago
Comment by egorfine 1 day ago
Comment by dcreater 1 day ago
Comment by logicallee 23 hours ago
Comment by Fokamul 1 day ago
Comment by rvz 1 day ago
Anyway, this looks like a great idea and might have a chance at solving the economic issue of running nodes for cheap inference and getting paid for it.
Comment by sharts 1 day ago
Comment by jaylane 1 day ago
Comment by jonhohle 1 day ago
I cringe every time I see this sentence structure. I know the joke is about emdashes, but the “Its not …. It’s ….” drives me crazy.
Comment by sergiusignacius 1 day ago
Comment by rustyhancock 23 hours ago
Comment by parasubvert 1 day ago
Comment by projektfu 1 day ago
Tired: That is not a technology problem. It is a marketplace problem.
Wired: This is a marketplace problem, not a technology problem.
Comment by rustyhancock 23 hours ago
And I don't really know why both patterns don't appear equally.
Intuitively I just can't grasp how it can be so selective with the phrases it generates.
Comment by nnevatie 1 day ago
Comment by biztos 1 day ago
But the real solution is to do this other thing. If you'd like I can give you the three-step guide to fixing the thing you asked me to fix.
/s
Comment by jiusanzhou 1 day ago
Comment by 0xelpabl0 1 day ago
Comment by eddie-wang 1 day ago
Comment by bojangleslover 1 day ago
Comment by smooth968 23 hours ago
Comment by jstlykdat 1 day ago