Measuring Performance of Hermes in React Native

Facebook announced Hermes, a new JavaScript engine, during the keynote at Chain React 2019. During the talk, we saw videos and numbers showing Hermes and the current JavaScript engine side by side. This blog post dives deeper into those experiments and how they were structured. The goal is to make the experiments reproducible and verifiable.

Goal

The goal of the experiments is to understand how the different JavaScript engines impact the performance of React Native. Currently, we can run React Native with JSC, Hermes and V8.
While benchmarks like Octane compare JavaScript execution in isolation, these experiments test the engine in a mobile environment, where engine startup and first run matter.

The Apps

To ensure that the tests represent real-world scenarios, we run end-to-end scenarios on apps that sit at either end of the spectrum.
  1. Chain React Conference app - A small-scale React Native app with minimal network dependencies but a fair amount of content.
  2. Mattermost app - A real-world app with complex interactions, layouts and functionality, used regularly by many people, primarily for business purposes.
Both apps needed to be open source so that I could modify their source code and others could run the tests.

Test Scenario

Cold start was the only scenario tested. It yields metrics like cold start time, memory consumption, APK size and the entire startup timeline. Scenarios like animations, scroll performance and navigation depend on the native environment, and may not be a good fit for comparing JavaScript engines.

Setup

Upgrade to 0.60: At the time of testing, ChainReact and Mattermost were not on React Native 0.60. The first order of business was to upgrade them to the latest version. React Native 0.60 for Android comes with non-trivial changes (like AndroidX), and automated scripts were used to facilitate the upgrade. Commits for the upgrade are available in the ChainReact and Mattermost repo forks.

Adding Instrumentation: Instrumentation was added to both apps to get data points on the Java startup path, as well as the mount times for components. The instrumentation was based on a previous blog post, and the performance data was sent to a server running on port 3000. This work is available as commits in the ChainReact and Mattermost repo forks.
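To make the receiving side concrete, here is a minimal sketch of a collector that could listen on port 3000. The post does not describe the actual server, so everything here is an assumption: the payload is taken to be a JSON object mapping marker names to millisecond timestamps, and the names parse_marks, PerfHandler, RECEIVED and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# One dict of markers per app launch (hypothetical in-memory store).
RECEIVED = []


def parse_marks(body: bytes) -> dict:
    """Decode a JSON object of marker name -> timestamp in milliseconds.

    The payload shape is an assumption; the post does not specify it.
    """
    marks = json.loads(body.decode("utf-8"))
    if not isinstance(marks, dict):
        raise ValueError("expected a JSON object of marker -> timestamp")
    return {str(name): float(ts) for name, ts in marks.items()}


class PerfHandler(BaseHTTPRequestHandler):
    """Accepts one POST per app launch and stores the decoded markers."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            RECEIVED.append(parse_marks(self.rfile.read(length)))
            self.send_response(204)  # accepted, no response body
        except (ValueError, TypeError):
            self.send_response(400)  # malformed payload
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the console quiet during long test runs


def serve(port: int = 3000) -> None:
    """Block forever, collecting markers posted by the instrumented app."""
    HTTPServer(("0.0.0.0", port), PerfHandler).serve_forever()
```

Calling serve() blocks the current thread, so a test harness would typically run it in a background thread or a separate process.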

Network Isolation: To remove flakiness due to the network, the networking module was overridden with an OkHttp client that points to a proxy. The proxy saves all responses needed for TTI and replays them with equal delays for all tests.
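The record-and-replay idea behind the proxy can be sketched as follows. This is not the proxy used in the tests; it is an illustrative in-memory store, assuming responses are keyed by HTTP method and URL and that a single fixed delay is applied to every replay. ReplayStore and Recorded are hypothetical names.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Recorded:
    """A saved response: status code and raw body."""
    status: int
    body: bytes


class ReplayStore:
    """Saves responses once, then replays them with a constant delay.

    Keying by (method, URL) and using one fixed delay are assumptions
    made for this sketch.
    """

    def __init__(self, delay_s: float = 0.05, sleep=time.sleep):
        self._delay_s = delay_s
        self._sleep = sleep  # injectable so tests need not really wait
        self._responses = {}

    def record(self, method: str, url: str, status: int, body: bytes) -> None:
        self._responses[(method.upper(), url)] = Recorded(status, body)

    def replay(self, method: str, url: str) -> Recorded:
        # Apply the same delay on every request so that network time is
        # identical across engines and iterations.
        self._sleep(self._delay_s)
        key = (method.upper(), url)
        if key not in self._responses:
            raise KeyError(f"no recorded response for {method} {url}")
        return self._responses[key]
```

Because every replay costs the same amount of time, any difference in TTI between runs can be attributed to the app and the engine rather than to the network.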

Test Script: The tests were run 100 times on an Android Pixel device for each app, for each engine. The tests were automated, and the app was launched using intents. The test concluded when the instrumentation server received data. The script would then run adb shell dumpsys meminfo to understand the memory usage.
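A harness along these lines could drive the loop described above. It is a sketch, not the script used for the tests: the force-stop before each launch is an assumption about how a cold start was ensured, wait_for_report is a hypothetical callable that blocks until the instrumentation server has received data, and the regular expression for the TOTAL row is a guess at the dumpsys meminfo layout, which varies between Android versions.

```python
import re
import subprocess


def launch_cmd(component: str) -> list:
    """Start an activity by component name, e.g. 'com.example/.MainActivity'."""
    return ["adb", "shell", "am", "start", "-n", component]


def stop_cmd(package: str) -> list:
    """Kill the app process so the next launch is a cold start (assumption)."""
    return ["adb", "shell", "am", "force-stop", package]


def meminfo_cmd(package: str) -> list:
    return ["adb", "shell", "dumpsys", "meminfo", package]


def parse_total_pss_kb(meminfo_output: str):
    """Return the first number on the TOTAL row, or None if it is absent.

    The row layout differs between Android versions; treat this pattern
    as a starting point rather than a complete parser.
    """
    match = re.search(r"^\s*TOTAL(?: PSS)?:?\s+(\d+)", meminfo_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def run_iteration(package: str, component: str, wait_for_report) -> dict:
    """One cold start: stop, launch, wait for markers, then sample memory."""
    subprocess.run(stop_cmd(package), check=True)
    subprocess.run(launch_cmd(component), check=True)
    marks = wait_for_report()  # blocks until the server has the data
    result = subprocess.run(
        meminfo_cmd(package), check=True, capture_output=True, text=True
    )
    return {"marks": marks, "total_pss_kb": parse_total_pss_kb(result.stdout)}


def run_all(package: str, component: str, wait_for_report, iterations: int = 100) -> list:
    return [run_iteration(package, component, wait_for_report) for _ in range(iterations)]
```

Keeping the command builders separate from the code that executes them makes the harness easy to check without a device attached.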

Results

For each app and engine, the 75th percentile TTI across the 100 iterations was chosen as the reported value. The results follow.
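For readers reproducing the numbers, one common way to pick a percentile from a set of runs is the nearest-rank method, sketched below. The post does not say which percentile definition was used, so this choice is an assumption; interpolating methods can give slightly different values.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]
```

With 100 iterations, the 75th percentile under this definition is simply the 75th smallest TTI. Reporting a high percentile rather than the mean keeps a few unusually fast runs from making an engine look better than users would typically experience.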

Mattermost Mobile App

These were the numbers shown at Marc's keynote, and compared JSC and Hermes.
Looking at the traces, we can see that both cases did a similar amount of work during the initial phases, when the Java code executes. The divergence becomes visible when React Native starts executing, and grows as React runs reconciliation to mount components.
The APK size links also show the tree maps of sizes, and the large difference can be attributed to the fact that Mattermost needed to use the "intl" version of JSC.

ChainReact App

The Chain React app was tested across the three VMs (V8, JSC and Hermes), and here are the results from the 75th percentile.


The APK size, timeline and memory usage can be found in this list. Similar to the Mattermost app, the timelines diverge during the JavaScript execution phase. The APK size also differs, primarily due to the embedded engine.

Conclusion

While I have tried to make the tests representative of real-world apps, I would highly recommend instrumenting and testing your own apps, and picking the right JavaScript engine for your React Native app based on those results. V8 supports snapshots, and JSC just landed the ability to use bytecode, and I am working on adding those variations to the tests too.
I am also working on getting Hermes working with react-native-js-benchmark. In the meantime, here are some ideas that you could use to improve the performance of your React Native app.