Query In Place

Presto is a distributed SQL query engine for big data that works with a wide variety of data sources.
“Presto is incredibly fast due to its optimized query engine and is well suited for interactive analysis.”
Praveen Murugesan
Engineering Manager at Uber

What is Presto?

Presto (today developed as two projects, Presto and Trino) is a SQL query engine originally built by Facebook developers in 2012 as a high-performance, heavy-duty SQL system for large data centers. The project was released as open source in 2013 so that anyone could modify it and use it for their own needs.

In January 2019, the Presto Software Foundation was announced. At the same time, development of Presto forked into two projects: PrestoDB, maintained by Facebook, and PrestoSQL, maintained by the Presto Software Foundation. In September of that same year, Facebook donated PrestoDB to the Linux Foundation. Later, in December 2020, the original committers and top contributors rebranded PrestoSQL as Trino.

Both open source projects handle massive quantities of data with SQL queries in a multi-layered, scalable, and efficient manner. Presto stands out from other SQL engines because it works with almost all common data sources, such as Hadoop, AWS S3, Alluxio, MySQL, Cassandra, Hive, MongoDB, Teradata, and many more.

Another thing that makes Presto unique is its querying system. It allows the software to query multiple data sources within the same query, actively streamlining the efficiency and performance of the engine.
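To make the idea of one query spanning several sources concrete, here is a minimal sketch in Python that assembles such a query. It relies on Presto's three-part `catalog.schema.table` naming; the catalogs, schemas, tables, and columns used here (`hive.web.page_views`, `mysql.crm.customers`, and so on) are hypothetical and only serve as an illustration.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a catalog.schema.table name, the three-part form Presto uses."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query() -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    # Hypothetical tables: one exposed through a Hive catalog, one through MySQL.
    views = qualified("hive", "web", "page_views")
    customers = qualified("mysql", "crm", "customers")
    return (
        "SELECT c.country, count(*) AS views\n"
        f"FROM {views} AS v\n"
        f"JOIN {customers} AS c ON v.customer_id = c.id\n"
        "GROUP BY c.country"
    )


if __name__ == "__main__":
    print(build_federated_query())
```

The resulting statement could then be submitted through whichever Presto client you already use; the point is only that both sources appear in a single query.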

Presto is an exciting development. Originally designed to handle large data centers, Presto is becoming the de facto SQL query engine for big data.

Big data is still in its infancy, but it requires a programmable, adaptable, and efficient SQL engine to handle queries across such a wide infrastructure. All of these are flagship features of Presto.

Presto Benefits

The SQL Engine of Tomorrow, Today

Presto is equipped to handle data sources of any type or size, and since it’s open-source software, it’s easy to manage, change, and adapt for any professional or personal query use.

Lightning-fast analytics

Presto is built from the ground up to favor quality, quantity, and speed – by adding Presto to your arsenal, you can make fast queries and get reports and analytics in no time at all.

Rummage through multiple data sources

Most SQL engines are focused on one or two data sources, but Presto is equipped to handle many of them. Presto can work with data sources from across the Apache ecosystem and can be extended to handle even more. It can also query NoSQL data stores.

Multi-layered Queries

Presto can query multiple data sources with just one query, enabling you to extract information quickly and efficiently.

Cloud friendly

Presto can be applied to your existing cloud infrastructure with ease.

Huge community

Presto is open-source software with a buzzing developer community working on improving the software daily. You might be able to find solutions that haven’t been implemented in the software version just by rummaging through the community forums.

Straightforward integration

Presto is relatively simple and lightweight but incredibly useful and powerful. Its simplicity allows you to integrate it into your existing framework or ecosystem quickly.

Open-Source Design

Lastly, and most importantly, Presto is open source. It can be adapted to fit almost any requirement, making it cost-effective and efficient in operations of all sizes.

Presto Features

Query anything

Presto can query a wide range of data sources, including traditional SQL and relational databases, NoSQL stores, and proprietary data stores.

Pipeline executions

Presto has a unique data querying system built-in, which allows it to keep latency at a minimum by avoiding as much I/O latency overhead as possible.
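As a rough illustration of what pipelined execution means, the sketch below chains three stages with Python generators so that each row flows from scan to filter to projection without any stage first writing out its full result. This is a simplified analogy, not Presto's actual implementation; the function names and the `amount` field are made up for the example.

```python
from typing import Dict, Iterable, Iterator

Row = Dict[str, float]


def scan(source: Iterable[Row]) -> Iterator[Row]:
    """Stage 1: read rows one at a time from the source."""
    for row in source:
        yield row


def filter_min_amount(rows: Iterable[Row], minimum: float) -> Iterator[Row]:
    """Stage 2: pass along only rows whose amount is at least `minimum`."""
    for row in rows:
        if row["amount"] >= minimum:
            yield row


def project_amount(rows: Iterable[Row]) -> Iterator[float]:
    """Stage 3: keep just the amount value of each row."""
    for row in rows:
        yield row["amount"]


def run_pipeline(source: Iterable[Row], minimum: float) -> Iterator[float]:
    """Chain the stages; no stage stores its complete output before the next starts."""
    return project_amount(filter_min_amount(scan(source), minimum))


if __name__ == "__main__":
    data = [{"amount": 5.0}, {"amount": 20.0}, {"amount": 12.5}]
    for value in run_pipeline(data, 10.0):
        print(value)
```

Because each stage hands rows to the next as soon as they are ready, the first results can appear before the whole input has been read.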

Fully customizable

Since Presto is a piece of open-source software, in-house teams can modify and change it to add any function they want.

Vectorized processing

Presto comes with vectorized columnar processing features that allow it to operate on a wide set of values distributed into columns, further adding to the sophistication of the software.
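The difference between row-by-row and columnar processing can be sketched in a few lines of Python. The example below regroups row tuples into one typed array per column and then aggregates a whole column at once. It is only an analogy for the columnar layout, under the assumption of a two-column table (`id`, `amount`) invented for this illustration, and does not reflect Presto's internal data structures.

```python
from array import array
from typing import Dict, List, Tuple


def to_columns(rows: List[Tuple[int, float]]) -> Dict[str, array]:
    """Regroup (id, amount) row tuples into one typed array per column."""
    return {
        "id": array("q", (row[0] for row in rows)),      # 64-bit integers
        "amount": array("d", (row[1] for row in rows)),  # double-precision floats
    }


def sum_column(columns: Dict[str, array], name: str) -> float:
    """Aggregate a single column without touching the other columns."""
    return sum(columns[name])


if __name__ == "__main__":
    table = to_columns([(1, 10.0), (2, 20.5), (3, 4.5)])
    print(sum_column(table, "amount"))
```

Keeping each column's values together means an aggregation only has to read the column it needs, which is the main appeal of a columnar layout.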

Connector support

You can plug a connector into Presto for each data source. The connector supplies the metadata and data that Presto needs to query that source.

Query optimization

Presto automatically optimizes all queries it sends out due to its multi-layered system, allowing it to access more information at a faster rate.

Superb analysis

Presto can provide on-site analysis through its powerful centralized interface, define a direct data feed, and present it in numerical values while it queries in real-time.

Since Presto keeps absolutely no cache on it unless you program this feature into the code, you don’t have to worry about underlying data source contamination.

Top-shelf scalability

Presto can automatically manage resources and scale them up and down depending on the demand.

Do you want to implement the world’s premier SQL querying solution in your framework? There’s no reason to wait. Signing up takes no time at all.

Presto vs. Apache Hive

Presto was initially developed to overtake Apache Hive, as Hive couldn’t perform SQL queries with speed and finesse. Still, some companies and people use Hive, so we’ve decided to show just how much better Presto is. If you want to see it in action, book here for a quick demo.

Faster Speed – Presto operates at a much faster rate than Apache Hive when it comes to querying. That’s because Presto has no internal cache mechanism built-in, works on a unique multi-layered system, and is optimized to pull data from numerous data sources.

Straightforward pipeline – When working with Hive, you’ll have to wait for in-between stages to get your data. With Presto’s direct pipeline, you cut down on the waiting time considerably.

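No stored copies – Presto doesn’t keep its own copy of the data or, by default, a cache; it queries data where it lives. Apache Hive on MapReduce, on the other hand, writes intermediate results to disk between stages, which significantly impacts its performance.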

Open Source – Both are open-source, but Presto has a far more active community working on streamlining and improving the software every day. Hive is reserved for major corporate applications and companies with in-house IT departments.

Push Model – Apache Hive pulls data from the data centers, while Presto pushes it out. This makes for a much more streamlined process and helps mitigate data loss and crashes, neither of which is common with Presto.

Integration Conundrum – Presto is known for its simple integration within an existing framework and can easily adapt. Hive, on the other hand, needs to be modified to adapt to any existing configuration. 

Scalability – Unlike Hive, Presto is built for scalability, making it a good tool for data centers of any size. Hive works best with larger data centers.

Cloud friendly – Presto can be applied to the cloud with relative ease, while Hive isn’t cloud-friendly.

See How Easy It Is To Start With Presto

Pandio’s Presto Consistently Outperforms Its Competitors

Presto delivers the fastest and most reliable SQL queries with superb scalability, an open-source design, and seamless integration.

Companies Using Presto

Frequently Asked Questions

Pandio’s Presto is a powerful SQL engine that can handle anything from small-scale data centers to up-and-coming technologies such as big data. Presto is fully open source and can work with all kinds of data sources. It’s very reliable and highly scalable, making it ideal for all querying applications. Presto was developed by Facebook and was later donated to the Linux Foundation.

Presto is better than Hive in almost every way, the most important of which are:

  •   Presto doesn’t keep its own copy of the data or, by default, a cache.
  •   Unlike Hive, Presto has a direct pipeline data system, cutting down on latency.
  •   Presto is fully scalable and works well on small and large applications alike, while Hive only works well for large applications.
  •   The Presto open source code is maintained and improved by the community, and Hive doesn’t have anything near that level of support.


A wide range of companies and large enterprises use Presto daily, including:

  •   Atlassian
  •   Amazon
  •   Airbnb
  •   Facebook
  •   Netflix
  •   NASDAQ
  •   Bigin
  •   Gympass
  •   Amperity
  •   Walt Disney
  •   DBS C2E

Presto is used as a premier SQL engine for any application possible. It’s used to extract data from huge databases, query more than one application simultaneously, and act as the primary analytics tool. Companies also use Presto for data combinations, querying data from multiple sources, and running aggregations.

Presto is used across all industries. From financial giants over at NASDAQ to the entertainment megacorporation Walt Disney, Presto is used wherever there’s any querying and analytical need.

Presto has memory-to-memory transfer, which is one of its main selling points and the primary reason behind its speed. Unlike a traditional pipeline, Presto uses a direct hybrid pipeline that allows it to pull data from multiple sources simultaneously.

Presto, in its initial form, has ceased to exist. These days, the two versions of what Presto used to be are Presto, hosted by the Linux Foundation, and Trino, developed by the community and the developers who worked on Presto before it was donated to the Linux Foundation.

The differences between the two are pretty minimal, but Trino seems to have a far more active community of developers behind it. Both versions share most metrics and features.

Want to See Presto in action?

Try Pandio’s fully managed Presto service for free and start querying across any data sources you have.