Zengo - Augmented reality in the Apple ecosystem

Zengo - 10 minutes reading time
2022. 12. 08.

Augmented reality is one of the biggest tech trends today. Its growth does not seem to be slowing down anytime soon, as smartphones and other AR-capable devices are becoming more accessible worldwide.

What is AR?

Augmented reality (AR) is an enhanced version of the real physical world around us, overlaid with digital objects. It aims to influence our experience and senses by adding a layer of video, infographics, pictures, sound and other details over the actual physical world.

In other words, when we look at the world through our device’s camera, we can alter what we see digitally and add fully virtual things to the scene. This is where the term “augmented reality” comes from. This type of tech is used in many ways today, including Snapchat lenses and shopping apps that let us try clothes on virtually with the help of AR.

The most famous application that uses AR technology is probably Pokémon GO, which was released in 2016 and became a global sensation in no time. The uses of augmented reality are not limited to catching Pokémon, however; there are several other useful applications of this technology:

  • improved navigation systems, which place the route digitally on the actual view of the road on the windshield,
  • during football matches, broadcasters draw lines on the football pitch using AR in order to illustrate and analyze the game,
  • interior design applications with which we can place virtual furniture in our home,
  • at historical sites, like in Pompeii, we can take a glimpse into ancient civilizations by projecting them over the ruins that remain, making the past come alive.

Image credit: Niantic

AR vs VR

People often confuse augmented reality and virtual reality, so let’s see how the two differ. Augmented reality adds objects to the physical world around us: it is a digital layer that we can only see through an AR-capable device, within the frame that the device provides. Virtual reality, by contrast, is a completely computer-generated digital environment in which users can fully immerse themselves. Virtual reality can also be used to capture a real physical place and embed a virtually reconstructed version of it in a VR application. With a VR headset, anyone can walk around on the other side of the world as if they were actually there.

In short, VR replaces the environment around us with something else and AR enhances it with virtual objects.

The two differ not only in the way they work but also in the devices they use. VR uses headsets which fit on the user’s head and display simulated visual and auditory information. AR devices are simpler: usually smartphones, glasses, projectors and the so-called HUDs (head-up displays) in cars.

Augmented reality is available to anyone with a smartphone, so compared to VR it is more efficient for brand building and as a gaming tool. In addition, the higher price of VR systems makes regular, everyday use harder. VR can also affect the user’s health: the technology needs more time to mature so that inexperienced users are not faced with unpleasant side effects like blurred vision, headache or nausea.


While early virtual reality systems took off in the 1950s and 1960s, VR training concepts used by the military started gaining traction at the beginning of the 1980s. The first attempt to release a mainstream VR headset was made by Sega in 1993. The Sega VR was meant to be an accessory to the Sega Genesis game console, but it never hit the shelves. The first commercially successful VR headset was the Oculus Rift, which began as a prototype around 2010 and reached consumers in 2016. Today these kinds of devices are still expensive and are mainly targeted at gamers.

Augmented reality branched off from virtual reality around 1990 and came to public attention in 1998. Around that time, US television broadcasts of football games started using a digital yellow line to indicate where the first down was on the field. In the following decade AR technology started popping up everywhere, and many applications were built around it for various purposes. It was used in the military to display more information in fighter jet cockpits, and in marketing, where a QR code placed on a product could be scanned to make the product “come alive” in a short 3D video.

In 2014 Google introduced ‘Google Glass’, a wearable hands-free AR device with a transparent display. The first iteration of this gesture- and voice-controlled device was still very unsophisticated. The fact that the glasses could be exploited to record video in public 24 hours a day caused quite a public uproar. Google first suspended development and later relaunched the product for corporate users only.

Image credit: Google

AR’s future

Right now AR is a very exciting technology, but what does the future hold? Facebook, in collaboration with Ray-Ban, recently released its new smart glasses. Apple has AR solutions in development too and is planning to release a headset and smart glasses. The expansion of 5G networks will also make AR experiences easier to support, because complicated computing processes can be offloaded to the cloud over the network, so the device itself won’t need as much computing power. As a result, smaller and more fashionable devices can be made.

The spread of LiDAR brings more realistic AR products to our phones. Apple has built LiDAR into the Pro models of the iPhone, from the iPhone 12 Pro onward, and into the iPad Pro. With LiDAR (Light Detection and Ranging) we can map our environment in 3D, which increases the AR capabilities of a device. It gives AR creations a sense of depth, making them blend better into the real physical space. It also allows for occlusion, which means that any real physical object in front of an AR object will cover the AR object.
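
To make the idea of occlusion more concrete, here is a minimal sketch in plain Python. It is not ARKit code: the function names and the "camera" / "virtual" labels are our own illustrative assumptions. It only shows the per-pixel depth comparison that occlusion boils down to, where a virtual pixel is drawn only if it is closer to the camera than the real surface measured at that pixel.

```python
from typing import List, Optional


def composite_pixel(real_depth_m: Optional[float],
                    virtual_depth_m: Optional[float]) -> str:
    """Decide what one pixel shows (hypothetical helper, not an ARKit API).

    real_depth_m:    distance to the real surface at this pixel, in meters
                     (for example from a depth sensor), or None if unknown.
    virtual_depth_m: distance to the virtual object at this pixel, in meters,
                     or None if no virtual object covers this pixel.
    """
    if virtual_depth_m is None:
        return "camera"    # nothing virtual here, show the camera image
    if real_depth_m is None:
        return "virtual"   # assumption: with no depth reading, nothing occludes
    # The virtual object is visible only when it is closer than the real surface.
    return "virtual" if virtual_depth_m < real_depth_m else "camera"


def composite_row(real: List[Optional[float]],
                  virtual: List[Optional[float]]) -> List[str]:
    """Apply the depth test to a whole row of pixels."""
    return [composite_pixel(r, v) for r, v in zip(real, virtual)]


# A real object 1 m away hides a virtual object placed 2 m away,
# while a wall 3 m away does not.
row = composite_row([1.0, 3.0, 3.0], [2.0, 2.0, None])
```

In this example the first pixel shows the camera image (the real object is in front), the second shows the virtual object, and the third shows the camera image because no virtual object covers it.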

AR in the Apple ecosystem

ARKit is Apple’s AR development platform for mobile devices running iOS. With the help of ARKit, developers can produce realistic AR experiences for iPads and iPhones. It first became available with the release of iOS 11 and has become much more advanced since then. It allows physical spaces to be filled and enhanced with digital content. AR applications built on ARKit have better, more stable tracking and more detailed content than other AR apps running on iOS, since ARKit makes full use of Apple’s hardware.

ARKit can also interpret facial expressions in real time and use that data to drive virtual 3D characters. Many will be familiar with Snapchat lenses or Memoji, both of which use this technology. ARKit 3 added Motion Capture, which records body movements as well as facial expressions. Its other major new feature was occlusion. Both improvements give AR content a more realistic appearance.

ARKit monitors the accelerometer, the gyroscope and the spatial movement of the device while mapping the immediate environment. Combining all of these gives a very accurate 3D map of our surroundings.

Simply put, ARKit is the eye of any AR application: it tells developers what the device sees in the real world. Let’s see which frameworks we can use to augment reality!


RealityKit is the newest addition to Apple’s family of rendering technologies. It is a high-level framework, first released in 2019 to make developing AR applications easier. RealityKit was purpose-built for AR/VR projects, offers simplified setup for multi-user experiences and runs on both iOS and macOS. It also does multithreaded rendering. RealityKit supports only Swift, not Objective-C, and it reads only the .usdz, .rcproject and .reality file formats.

RealityKit provides high-quality rendering technology and up-to-date AR abilities. It supports LiDAR scanning as well as photogrammetry tools, with which we can generate 3D models from pictures we have taken.


SceneKit is also a high-level framework. It was released in 2012 and is the oldest rendering technology that Apple offers. SceneKit was primarily designed for the development of 3D games, but later received VR support. For AR projects it can only be used together with ARKit, and it runs on iOS and macOS. SceneKit supports both Objective-C and Swift. Its main advantage over RealityKit is that it is highly customizable. It can read multiple file formats, including .usdz, .dae and .obj.

Apple hasn’t updated SceneKit since 2017.


Metal is essentially a 3D graphics API. It is a very low-level framework, released in 2014. All of the frameworks mentioned above (RealityKit, SceneKit and ARKit) are built on Metal, but it can also be used by itself as a rendering tool for 3D graphics.

From a developer’s point of view it is much more complicated and time-consuming to work with, but in return it is more customizable than SceneKit and also much faster. Developers usually use it for games with complex 3D environments or for applications that process the large amounts of data required for scientific research.

Developer experience

We were fortunate that one of Zengo's customers entrusted us with an AR-related task. The goal was to develop an interior design application optimized mainly for iPad. Many different transformations can be performed on the digital objects placed in space, for example moving, rotating and scaling. The results of placing and altering objects can also be saved and then reloaded on location.

At the beginning of development, the first important step was to select the right framework to render the AR objects alongside ARKit. We had three options to choose from: RealityKit, SceneKit and Metal. Because of the advantages described above, we chose RealityKit. This allowed us to speed up the workflow, and its many built-in development support functions made the work easier for developers with less 3D experience.

The next step was figuring out how to find the digital objects that we want to place in the environment using the app. The first hurdle was that RealityKit does not support the dominant 3D modeling file formats such as .obj or .fbx; we can only work with USDZ models. However, Apple does provide a converter app with which we can generate USDZ files from most 3D file formats. After conversion, managing USDZ files is quite easy on Apple devices, as most of them support the format by default and we can preview the models through the camera without having to install any applications.

To place the objects in space, we first have to import them into the app. These can be files saved on the device, or we can implement a file browser with which we can search among files saved locally or on iCloud. We can also import USDZ files by opening them with the app when they are received via iMessage or by email. If the import was successful, we have to load the objects into memory, which can take 3-4 seconds depending on the file sizes. The objects are always placed in space relative to the center of the screen, and to make it easier to find the right position, we added an indicator square that shows exactly where the object will be placed on the plane.
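
One way to picture placement relative to the center of the screen is as a ray cast from the camera through the screen center and intersected with the detected plane. The sketch below is plain Python rather than ARKit code, and it assumes a coordinate system with the y axis pointing up and a horizontal plane at a known height; the function name and parameters are hypothetical. In a real app the framework's own raycasting would do this work.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def place_on_horizontal_plane(camera_pos: Vec3,
                              view_dir: Vec3,
                              plane_height: float) -> Optional[Vec3]:
    """Intersect the ray through the screen center with the plane y = plane_height.

    camera_pos:   camera position in meters (assumed y-up coordinates).
    view_dir:     direction the camera looks in; it need not be normalized.
    plane_height: height of the detected horizontal plane in meters.
    Returns the point where the indicator would sit, or None if the
    camera does not look at the plane.
    """
    ox, oy, oz = camera_pos
    dx, dy, dz = view_dir
    if abs(dy) < 1e-9:
        return None  # looking parallel to the plane, no intersection
    t = (plane_height - oy) / dy
    if t <= 0:
        return None  # the plane is behind the camera
    return (ox + t * dx, plane_height, oz + t * dz)


# Camera held 1.5 m above the floor, tilted 45 degrees downward.
indicator = place_on_horizontal_plane((0.0, 1.5, 0.0), (0.0, -1.0, -1.0), 0.0)
```

Here the indicator lands on the floor 1.5 m in front of the camera. When the function returns None, the app would simply hide the indicator square.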

The transformations are given by 3D vectors in space, and these must be taken into account when rotating, scaling and moving the object. Their values are measured relative to a reference origin, which is either the device's camera or another virtual object. The pivot point, which defines the point of the model that the transformations are relative to, is usually located at the bottom of Apple's objects. This can be problematic when we use an external object whose pivot point is elsewhere, because we can’t move the pivot point within the framework. One solution is to wrap the object in another object whose pivot point is at the bottom, so that the transformations we apply behave the same way as for every other object. Dimensions and distances are specified in meters, which is easy to work with since we also use meters in real life. Measurement is very accurate with LiDAR-equipped cameras, but even older single-camera systems can be reasonably accurate.
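
The wrapping trick can be sketched with a little vector math. The following plain Python example is an illustration under our own assumptions, not RealityKit code: the Wrapper class and its fields are hypothetical, and only uniform scaling is modeled to keep it short (rotation behaves the same way). The model's pivot is assumed to be at its center, so its bottom lies at a negative local height.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Wrapper:
    """Hypothetical parent object whose origin is the pivot we actually want."""
    position: Vec3      # world position of the wrapper origin, in meters
    scale: float        # uniform scale applied around the wrapper origin
    child_offset: Vec3  # where the model's own pivot sits inside the wrapper

    def to_world(self, p_model: Vec3) -> Vec3:
        # model space -> wrapper space (offset) -> scale -> world translation
        return tuple(
            self.position[i] + self.scale * (p_model[i] + self.child_offset[i])
            for i in range(3)
        )


def wrap_with_bottom_pivot(model_bottom_y: float, position: Vec3) -> Wrapper:
    """Shift the model up so that its bottom coincides with the wrapper origin."""
    return Wrapper(position=position, scale=1.0,
                   child_offset=(0.0, -model_bottom_y, 0.0))


# A 1 m tall model with a centered pivot: its bottom is at y = -0.5.
wrapper = wrap_with_bottom_pivot(-0.5, (2.0, 0.0, 1.0))
wrapper.scale = 3.0  # scale the wrapper, not the model itself

bottom = wrapper.to_world((0.0, -0.5, 0.0))
top = wrapper.to_world((0.0, 0.5, 0.0))
```

After scaling, the bottom of the model is still on the floor at the wrapper's position and only the top moves, to a height of 3 m. Scaling the model around its own centered pivot would instead have pushed its bottom below the floor.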

The most intriguing part was making saving and loading work. Our goal was to be able to reload the app and the objects after saving and pick up where we left off, with the objects retaining their positions within the physical space. For this we had to use a so-called anchor system set in space: we had to find objects in the scanned space which could then serve as reference points for the virtual objects. The real object serving as a reference point has to stand out noticeably from the given plane for the camera: a glass or a pillow can work, for example, but a small pebble is not enough. This is because the scanning process is computationally intensive, so we only make a rough, simplified map of the space.


The anchor object does not need to be close to the object we want to place. If there is only a sofa in a room, we can use that as the anchor point, and the application is able to use it as a reference point in space and place the object farther away from it. The identification of anchor points goes unnoticed by the user; they only see a status indicator showing whether the app has finished mapping the space. Once mapping is done, the scanned 3D map needs to be saved: it contains neither the objects nor their transformations, only the positions of the anchors. When loading, if we can scan the same space and the system recognizes the anchors, we then have to restore the objects and their transformations within the scene.
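
Because the saved map holds only the anchors, the objects have to be stored separately and put back relative to an anchor. The sketch below shows one possible way to do that bookkeeping in plain Python. It is not ARKit code and not necessarily how our app does it: all function names are hypothetical, the anchor pose is simplified to a position plus a rotation around the vertical axis, and we assume for illustration that the anchor may be found at a different pose in the new session.

```python
import json
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def world_to_anchor(anchor_pos: Vec3, anchor_yaw: float, world_pos: Vec3) -> Vec3:
    """Express a world-space point in the anchor's local frame."""
    dx = world_pos[0] - anchor_pos[0]
    dy = world_pos[1] - anchor_pos[1]
    dz = world_pos[2] - anchor_pos[2]
    c, s = math.cos(-anchor_yaw), math.sin(-anchor_yaw)  # inverse rotation
    return (c * dx + s * dz, dy, -s * dx + c * dz)


def anchor_to_world(anchor_pos: Vec3, anchor_yaw: float, local_pos: Vec3) -> Vec3:
    """Convert a point from the anchor's local frame back to world space."""
    x, y, z = local_pos
    c, s = math.cos(anchor_yaw), math.sin(anchor_yaw)
    return (anchor_pos[0] + c * x + s * z,
            anchor_pos[1] + y,
            anchor_pos[2] - s * x + c * z)


def save_layout(anchor_pos: Vec3, anchor_yaw: float,
                objects: Dict[str, Vec3]) -> str:
    """Serialize object positions relative to the anchor (not the anchor itself)."""
    layout = {name: list(world_to_anchor(anchor_pos, anchor_yaw, pos))
              for name, pos in objects.items()}
    return json.dumps(layout)


def load_layout(saved: str, anchor_pos: Vec3, anchor_yaw: float) -> Dict[str, Vec3]:
    """Restore world positions once the anchor has been recognized again."""
    layout = json.loads(saved)
    return {name: anchor_to_world(anchor_pos, anchor_yaw, tuple(local))
            for name, local in layout.items()}


# First session: the anchor (say, the sofa) is at (1, 0, 2), rotated 90 degrees.
saved = save_layout((1.0, 0.0, 2.0), math.pi / 2, {"chair": (1.0, 0.0, 5.0)})

# Later session: the same anchor is recognized at the same pose.
restored = load_layout(saved, (1.0, 0.0, 2.0), math.pi / 2)
```

Restoring with the same anchor pose returns the chair to its original world position. If the anchor were recognized at a different pose in the new session, the chair would move with it and still end up in the same physical spot relative to the sofa.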

A huge limitation we encountered with ARKit is that we don't have direct access to the device's camera. We can’t zoom in and we can’t use the Ultra Wide camera either. When using ARKit, rather than seeing what the camera sees, we get a cropped image. The crop is probably there so that hardware stabilization can be supplemented by digital stabilization. As a result, the objects stay stable in most cases even if our hands are shaking.

The biggest challenges came from the fact that there is very little documentation available, so we had to work most things out by experimenting. Furthermore, we had to focus on code quality and thorough optimization, as prolonged usage can overload the iPad, causing lag and glitches that can ruin the user experience.

Finally, after a lot of testing, the first version was completed, which won the approval of our colleagues and our customer too.


AR is definitely a very exciting and interesting field, and it certainly has a great future ahead, but it still needs to evolve. Hardware is only just beginning to reach the computing capacity needed to provide the right user experience. Once hardware can consistently handle intensive computing tasks efficiently, this tech will probably become integral to our daily lives.

Apple is considered a pioneer in providing mobile AR experiences, not only for users but for developers too. It provides software frameworks which make development significantly easier, but AR is still a young technology, at Apple at least, because very little documentation and few resources are available.

The blog post was written and edited by K. Tibor.