
The iPhone Is Still Not Enough
A French production photographer posted a quiet correction on X this week, and it captures the most important conceptual unlock in the new 3D capture economy. The iPhone is not the rig. The iPhone is a component of a rig. Here's what that distinction costs you if you miss it.
A French production photographer named Aurélien Camart posted a thread this week that started a small fight on X. The thread was titled, in a tone we'd describe as gently exasperated, "Made with just an iPhone — actually no, here's the real gear used."
The image attached: his actual capture rig. An XGRIDS handheld 3D scanner. An Insta360 360-degree camera. A DJI Avata FPV drone. A RØDE long boom pole. Two carbon-fiber extension poles. A small kit bag of accessories. He'd been getting "made with just an iPhone!" comments on his production 3D scans, and the thread was a quiet correction.
That correction is the most important conceptual unlock in the entire new 3D capture economy, and the marketing is actively obscuring it.
The iPhone is not the rig. The iPhone is a component of a rig.

The single-iPhone reality
A single iPhone, in 2026, is the best handheld camera ever sold. It still captures only a single point of view at a single moment in time. The new 3D capture economy requires many points of view at the same moment in time, and a single iPhone cannot supply that.
There are two ways around this.
The first is to capture over time — walk around the subject for two minutes, recording video, and let the software figure out the camera positions afterward. (This is the Structure from Motion step that anchors most reality-capture pipelines.) It works for static subjects. A parked car, a building, a piece of equipment that doesn't move. It does not work for anything in motion.
The second is to use many iPhones at once, synchronized. That works for moving subjects. A golf swing. A person walking. A liquid pouring. But it requires many iPhones plus a synchronization layer (5G, gigabit Ethernet, or some other low-latency network) plus a triggering system plus a piece of software that turns dozens of simultaneous video files into one 4D Gaussian Splat.
The iPhone is not the rig. The iPhone is a component of a rig. The rig is the network, the trigger, the on-device processing app, and the cloud pipeline that turns terabytes of synchronized video into a single navigable 3D moment.
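To make the synchronization requirement concrete, here is a minimal sketch of the kind of check a multi-camera rig needs: given one frame timestamp per camera for the same trigger, how far apart did the cameras actually fire? Everything here is an assumption for illustration: the `FrameStamp` type, the camera names, and the 1,000-microsecond tolerance are hypothetical, and a real rig would read timestamps from whatever shared clock its network provides.

```python
from dataclasses import dataclass


@dataclass
class FrameStamp:
    """One camera's capture time for a single trigger (hypothetical type)."""
    camera_id: str
    timestamp_us: int  # capture time on a shared clock, in microseconds


def max_skew_us(stamps):
    """Return the gap between the earliest and latest capture, in microseconds."""
    if not stamps:
        raise ValueError("need at least one timestamp")
    times = [s.timestamp_us for s in stamps]
    return max(times) - min(times)


def is_synchronized(stamps, tolerance_us=1000):
    """True when every camera fired within tolerance_us of the others.

    The default tolerance is an illustrative assumption, not a standard.
    """
    return max_skew_us(stamps) <= tolerance_us


# Three hypothetical cameras responding to the same trigger.
stamps = [
    FrameStamp("cam_a", 10_000_000),
    FrameStamp("cam_b", 10_000_400),
    FrameStamp("cam_c", 10_003_100),
]
print(max_skew_us(stamps), is_synchronized(stamps))
```

In this made-up example the third camera fires about 3 milliseconds late, so the set fails the check. For a fast subject like a golf swing, that kind of skew means the cameras did not record the same moment, which is why the network and the trigger matter as much as the cameras.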
Why "just an iPhone" is the wrong question
Aurélien's frustration in his thread is the same frustration every working production professional has: the people asking "did you really shoot that with just an iPhone?" are asking the wrong question.
The right question is: what's the system the iPhone is a component of?
For Aurélien, the system is his XGRIDS handheld 3D scanner doing the heavy geometric work, his Insta360 capturing context, his Avata drone capturing angles he can't reach on foot, his boom and poles getting the camera into positions a tripod can't, and his desktop machine running the splat training on a serious GPU after the capture. The iPhone, if there's one in the bag at all, captures the supplemental footage. It's the least important camera in the kit.
For our own pipeline, we use no iPhones — we use a 360-degree camera on a pole because we capture stationary subjects from a moving camera. Different problem, different tool. We'll publish a full pipeline write-up next week.
The pattern: in every working 3D capture system, the iPhone is one component. The system is the rig, the network, the workflow, and the pipeline. Marketing copy compresses all of that into "shot on iPhone." Anybody quoting work based on that compression is going to be unpleasantly surprised by what their actual rig costs.

What this means if you're a business owner buying 3D scanning
A few honest filters when evaluating any 3D scanning vendor or service:
Ask what the rig is, not what the camera is. "Shot on iPhone" is meaningless. "Shot with a 360-degree camera on a 12-foot extension pole in four orbits at three heights" is meaningful. "Shot with a multi-camera synchronized array networked over private 5G" is meaningful. Vendors who can't answer this question precisely are quoting hobby work.
Ask what happens to the data after the capture. "We process it in the cloud" is the marketing answer. "We process it through Structure from Motion in RealityScan, then train a Gaussian Splat in LichtFeld Studio, then publish through PlayCanvas's SuperSplat" is the working answer. Vendors who can't name their stack are reselling someone else's.
Ask about turnaround time and where the bottleneck is. A 24-hour turnaround that requires a $20,000 cloud bill per job is different from a 48-hour turnaround that runs on an in-house GPU. Both can be quoted at the same price. Only one of them survives volume.
Ask whether the scan is measurable. If you're paying for a 3D scan and you can't get dimensions out of it, you're paying for a 3D photograph. There's nothing wrong with that, but the price should reflect it. (We'll have more on the measurement layer in a follow-up post later this month.)
Ask whether the scan is navigable. If you're paying for a 3D scan and you can't walk through it, you're paying for a 3D photograph. Same caveat.
A 3D scan can be a photograph (you orbit it), a measurement instrument (you measure it), a navigable environment (you walk through it), or any combination of those. The price differential between "just a photograph" and "all three" is mostly software configuration — about 10% of the total work. Vendors who quote the same price for all three are doing one and pretending to do three. Vendors who quote different prices for each tier are doing real work and showing it.
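The "measurable" filter above comes down to one piece of arithmetic: a reconstructed scene is in arbitrary model units until something of known real-world size anchors it. The sketch below shows that step under stated assumptions. The function names, the 0.20 m reference length, and the point coordinates are all hypothetical, and this is not a description of any specific vendor's or tool's method.

```python
import math


def scale_factor(ref_a, ref_b, known_length_m):
    """Meters per model unit, from two model points a known distance apart."""
    model_length = math.dist(ref_a, ref_b)
    if model_length == 0:
        raise ValueError("reference points must be distinct")
    return known_length_m / model_length


def measure_m(p, q, scale):
    """Real-world distance in meters between two model points."""
    return math.dist(p, q) * scale


# Hypothetical reference: a marker edge known to be 0.20 m long
# that spans 0.5 model units in the reconstruction.
scale = scale_factor((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), known_length_m=0.20)

# Any other pair of points in the same scene can now be measured.
width_m = measure_m((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), scale)
print(round(scale, 3), round(width_m, 3))
```

Without a reference of known size in the capture, there is no `scale` to compute, and the scan stays a 3D photograph no matter how sharp it looks.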

The takeaway
The 3D capture economy is real. The deliverables it produces are real. The price points are increasingly accessible.
But the marketing is running ahead of the reality, and the reality is that every working production rig is a system, not a camera. The iPhone is a component of some systems and absent from others. There is no "made with just an iPhone" working production setup, and the people selling you one are either misleading you about their actual rig or producing hobby-grade output and charging production-grade prices.
If you're a business owner thinking about 3D scanning for your operation, here's the short version: find a vendor who can describe their rig, their pipeline, and their failure modes specifically. The rest is theater.
If you'd like to talk to one, get in touch. We can describe our rig, our pipeline, and our failure modes specifically. Several follow-up pieces this week and next will walk through the full deliverable stack — the pipeline itself, the navigation layer, the measurement layer, and how the new 3D capture economy is being priced in production today.
Related reading:
- Cybertruck to Walkable 3D Model in Two Hours: What We Just Shipped — our own ground-up rig, pipeline, and failure modes
- Walkable Splats: 418 KB Makes Any 3D Scene a Video Game — the navigation layer that turns a 3D photograph into a deliverable
- AprilTag-Scaled Gaussian Splatting: Photo-Realistic 3D With Real-World Dimensions — the measurement layer for industrial and engineering use cases
- Microsoft TRELLIS.2: Why We're Not Buying the Viral Framing — how to read viral AI release posts so your business doesn't budget on them