MinIO High Performance Object Storage
WHITE PAPER
Executive Summary
MinIO is a high performance, distributed object storage system. By following the methods
and design philosophy of hyperscale computing providers, MinIO delivers superior performance
and massive scalability to a wide variety of workloads in the private cloud.
While MinIO is ideal for traditional object storage use cases like secondary storage, disaster
recovery and archiving, it truly excels in overcoming the challenges of executing high-performance
computing against massive datasets.
In the modern enterprise, these workloads include machine learning, analytics and cloud-native
applications.
Because MinIO is purpose-built to serve only objects, a single-layer architecture achieves all of
the necessary functionality without compromise. The advantage of this design is an object
server that is high-performance and lightweight.
MinIO is a pioneer in the development of cloud-native object storage, refining and perfecting
many of the features, protocols and APIs that have come to define best in class. This is
evidenced by the more than 210M Docker pulls, 15K+ GitHub stars and the thousands of
production deployments across five continents.
This paper details the philosophical approach and technical attributes of MinIO and why those
attributes are important to any enterprise seeking to develop or migrate to an
object-storage-centric microservices architecture across the public and private cloud.
The Enterprise Challenge
How enterprises store, access, move and analyze data is undergoing massive change. Driven by
the storage and compute efficiencies made possible by disaggregation, enterprises are finding
that their investments in traditional storage solutions like Hadoop HDFS are now obsolete.
The weapon of elite hyperscalers, disaggregation offers multiple benefits, but the two largest
are better economics and support for performance-oriented use cases like machine learning and advanced
analytics. As a result, enterprises are rearchitecting their data infrastructures to take advantage
of this separation.
Figure 1: The modern, disaggregated architecture
The reasons are straightforward. File and block protocols are complex, have legacy architectures
that impede innovation, are limited in their ability to scale or are compromised from a
performance perspective. Examples of these limitations include the aforementioned aggregation
of compute and storage but also include replication, security, encryption and data mobility.
The winner in this transformation is cloud-native, object storage.
Storage as a Service or STaaS is the second-fastest growing cloud workload worldwide,
representing a USD 4.8 billion annual market. Data is growing exponentially every year and
by 2025, experts predict that the world will create and replicate 163 zettabytes (ZB) of data.
The vast majority of that will be unstructured or semi-structured.
Fueling that growth is a focus on big data applications, Internet of Things (IoT) and artificial
intelligence (AI) workloads. These workloads demand high rates of throughput, excellent data
integrity, and a cost-effective deployment model.
Simple, powerful and with unlimited scalability, modern object storage has moved out of backup
and into the application and analytic workflow. A reduced set of storage APIs, accessed over
HTTP RESTful services, means that these cloud-native solutions are lightweight enough to be
packaged with the application stack.
Figure 2: The advantages of modern object storage
The Philosophy Of The Cloud
MinIO combines the inherent advantages of object storage with a robust suite of features, a
stunningly simple, intuitive interface and an expansive set of integrations.
MinIO is unique in that it was built from the ground up with cloud-native technologies to be
simple, fast, durable and highly scalable. With the belief that a complex solution cannot be
scalable, a minimalist design philosophy forms the foundation of the MinIO architecture design.
The result is a system that excels across several key dimensions:
Performance. With its focus on high performance, MinIO enables enterprises to support
multiple use cases with the same platform. For example, MinIO’s performance characteristics
mean that you can run multiple Spark, Presto, and Hive queries, or quickly test, train and
deploy AI algorithms, without suffering a storage bottleneck. MinIO object storage is used as the
primary storage for cloud native applications that require higher throughput and lower latency
than traditional object storage can provide.
Scalability. A design philosophy that “simple things scale” means that scaling starts with a
single cluster which can be federated with other MinIO clusters to create a global namespace,
spanning multiple data centers if needed. Gradual expansion of the namespace is possible by
adding more clusters, more racks, and even by adding more data centers to the MinIO single
namespace. MinIO leverages the hard won knowledge of the web scalers to bring a simple
scaling model to object storage.
Simplicity. Minimalism is a guiding design philosophy at MinIO. Simplicity reduces opportunities
for errors, improves uptime and delivers reliability, while serving as the foundation for performance.
MinIO can be installed and configured within minutes simply by downloading a single binary and
then executing it. The number of configuration options and variations is kept to a minimum, which
results in near-zero system administration tasks and few paths to failure. Upgrading MinIO is
done with a single command that is non-disruptive and incurs zero downtime, lowering total
cost of ownership.
High Performance Object Storage
Every feature of MinIO’s object storage suite was architected to deliver performance and scale.
As a software-defined solution, MinIO can be paired with hundreds of different compute and
storage configurations from Intel Skylake or Xeon Gold processors to NVMe drives, spinning
disk - even tape.
Figure 3: A typical MinIO deployment. Clients reach the MinIO servers over the network; each
server is backed by a set of NVMe drives.
MinIO’s software defined object storage suite consists of a server, an optional client, and an
optional software development kit (SDK):
MinIO Server
MinIO is a distributed object storage server released under Apache License v2.0.
It boasts the most comprehensive implementation of the Amazon S3 API to be found anywhere
outside of Amazon itself. MinIO is feature-complete, providing enterprise-grade encryption,
identity management, access control, and data protection capabilities, including erasure code
and bitrot protection.
MinIO Client
Called mc, the MinIO Client is a modern and cloud-native alternative to familiar UNIX
commands like ls, cat, cp, mirror, diff, find and mv. This client provides advanced functionality
that is suitable for web-scale object storage deployments. For example, powerful data
replication tools work between multiple sites for HA (high availability) and DR (disaster
recovery) purposes, and mc can generate shared, time-bound links for objects.
MinIO SDKs
The MinIO Client SDKs provide simple APIs to access any Amazon S3-compatible object storage.
MinIO repositories on GitHub offer SDKs for popular development languages such as Golang,
JavaScript, .NET, Python and Java.
The features of MinIO’s Object Server are notable for their breadth, depth and focus on the
enterprise. As a cloud-native implementation, the range of features exceeds that of legacy or
bolt-on implementations, while the attention to engineering first principles ensures exceptional
performance.
S3 Select
Delivering big data, analytic and machine learning workflows requires filtered access to the data -
grabbing just what is relevant to a particular job.
MinIO has developed its own implementation of the S3 Select API which is essentially SQL query
capabilities baked right into the object store. Users can execute SELECT queries on their objects,
and retrieve a relevant subset of the object, instead of having to download the whole object.
With the S3 Select API, applications can download a specific subset of an object: only the
subset that satisfies a given SELECT query. This translates directly into efficiency and performance
by reducing bandwidth requirements and optimizing compute and memory resources, meaning more
jobs can be run in parallel with the same compute resources. As jobs finish faster, there is better
utilization of analysts and domain experts.
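A minimal Python sketch of the idea, offered as an illustration rather than MinIO's implementation. `build_select_request` assembles the parameters an S3 Select call takes for a CSV object with a header row (the parameter names follow the S3 API as exposed by boto3's `select_object_content`; the bucket and key are hypothetical). `filter_csv` is a local stand-in that shows what the server-side filter would return, so the saving in transferred bytes can be seen without a running server.

```python
import csv
import io


def build_select_request(bucket: str, key: str, sql: str) -> dict:
    """Keyword arguments for an S3 Select call on a CSV object with a header row."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def filter_csv(csv_text: str, column: str, value: str) -> str:
    """Local stand-in for the server-side filter: keep rows where column == value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if row[column] == value:
            writer.writerow(list(row.values()))
    return out.getvalue()


# Hypothetical object contents and query.
data = "id,region\n1,eu\n2,us\n3,eu\n"
request = build_select_request(
    "analytics", "events.csv", "SELECT * FROM S3Object s WHERE s.region = 'eu'"
)
subset = filter_csv(data, "region", "eu")
print(f"full object: {len(data)} bytes, selected subset: {len(subset)} bytes")
```

With an S3 client such as boto3 pointed at a MinIO endpoint, the request dictionary would be passed as `client.select_object_content(**request)`; only the matching rows would then cross the network.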
Erasure Coding
MinIO protects data with per-object, inline erasure coding which is written in assembly code to
deliver the highest performance possible. MinIO uses Reed-Solomon code to stripe objects into
n/2 data and n/2 parity blocks - although these can be configured to any desired redundancy
level. This means that in a 12 drive setup, an object is sharded across 6 data and 6 parity
blocks. Even if you lose as many as 5 ((n/2)–1) drives, be it parity or data, you can still reconstruct
the data reliably from the remaining drives. MinIO’s implementation ensures that objects can be
read or new objects written even if multiple devices are lost or unavailable.
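The arithmetic of the default layout can be sketched in a few lines of Python. This is only a worked example of the n/2 data and n/2 parity split described above, not MinIO code; the function name is hypothetical, the tolerated-loss figure uses the conservative (n/2)-1 stated in the text, and the requirement of an even drive count is an assumption made to keep the split exact.

```python
def erasure_layout(total_drives: int) -> dict:
    """Default split of n drives into n/2 data and n/2 parity blocks."""
    if total_drives < 2 or total_drives % 2:
        # Assumption for this sketch: an even, positive drive count.
        raise ValueError("expects an even, positive drive count")
    data = parity = total_drives // 2
    return {
        "data_blocks": data,
        "parity_blocks": parity,
        # Conservative figure quoted in the text: (n/2) - 1 drives.
        "tolerated_drive_losses": parity - 1,
        # Raw capacity consumed per unit of usable capacity.
        "raw_to_usable_ratio": total_drives / data,
    }


print(erasure_layout(12))
```

For 12 drives this gives 6 data blocks, 6 parity blocks, 5 tolerated drive losses and a raw-to-usable ratio of 2.0.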
Erasure code protects data without the high storage overhead of using RAID configurations or
data replicas. For example, RAID-6 only protects against a two-drive failure whereas erasure
code allows MinIO to continue to serve data even with the loss of up to 50 percent of the drives
and 50 percent of the servers.
Finally, MinIO applies erasure code to individual objects, which allows the healing of one object
at a time. For RAID-protected storage solutions, healing is done at the RAID volume level, which
impacts the performance of every file stored on the volume until the healing is completed.
Figure 4: Erasure code protects data without the overhead associated with alternative approaches.
The diagram (labeled export-xl) shows MyBucket/MyObject spread across Disk1 through Disk4, with an
xl.json file and a part.1 file on each disk.
BitRot Protection
Silent data corruption or bitrot is a serious problem faced by disk drives resulting in data getting
corrupted without the user’s knowledge. The reasons are manifold (aging drives, current spikes,
bugs in disk firmware, phantom writes, misdirected reads/writes, driver errors, accidental
overwrites) but the result is the same - compromised data.
MinIO’s optimized implementation of the HighwayHash algorithm ensures that it will never
read corrupted data - it captures and heals corrupted objects on the fly. Integrity is ensured from
end to end by computing a hash on WRITE and verifying it on READ, from the application, across
the network and to the memory/drive. The implementation is designed for speed and can achieve
hashing speeds over 10 GB/sec on a single core on Intel CPUs.
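The general technique of checksum-based bitrot detection can be sketched as follows: a digest is stored alongside the data when it is written and recomputed whenever the data is read back. This is a toy in-memory model, not MinIO's implementation. HighwayHash is not in the Python standard library, so `hashlib.blake2b` stands in for it here, and the function names and the dictionary used as a "drive" are hypothetical.

```python
import hashlib


def _digest(data: bytes) -> bytes:
    # Stand-in hash: MinIO uses HighwayHash; BLAKE2b is used here only
    # because it ships with the Python standard library.
    return hashlib.blake2b(data, digest_size=32).digest()


def write_shard(store: dict, name: str, data: bytes) -> None:
    """Store the data together with a digest computed at write time."""
    store[name] = (data, _digest(data))


def read_shard(store: dict, name: str) -> bytes:
    """Recompute the digest on read and refuse to return corrupted data."""
    data, expected = store[name]
    if _digest(data) != expected:
        raise IOError(f"bitrot detected in {name}")
    return data


drive = {}
write_shard(drive, "part.1", b"object payload")
assert read_shard(drive, "part.1") == b"object payload"

# Simulate silent corruption: the bytes change but the stored digest does not.
drive["part.1"] = (b"object pay1oad", drive["part.1"][1])
try:
    read_shard(drive, "part.1")
except IOError as err:
    print(err)
```

In a real erasure-coded system a failed check would trigger reconstruction of the shard from the remaining data and parity blocks instead of simply raising an error.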
Figure 5: MinIO’s data protection schemes cover failure and silent data corruption.
Identity and Access Management
MinIO supports the most advanced standards in identity management, integrating with
OpenID Connect compatible providers as well as key external IDP vendors. That means that
access is centralized and passwords are temporary and rotated, not stored in config files and
databases. Furthermore, access policies are fine-grained and highly configurable, which means
that supporting multi-tenant and multi-instance deployments becomes simple.
Figure 6: Identity protection and single sign on (SSO) are critical enterprise features.
The diagram shows a five-step flow involving the application and the identity provider (IdP).
Encryption and WORM
It is one thing to encrypt data in flight; it is another to protect data at rest. MinIO supports
multiple, sophisticated server-side encryption schemes to protect data - wherever it may be.
MinIO’s approach assures confidentiality, integrity and authenticity with negligible performance
overhead. Server side and client side encryption are supported using AES-256-GCM, ChaCha20-
Poly1305 and AES-CBC. Encrypted objects are tamper-proofed with AEAD server side
encryption. Additionally, MinIO is compatible with and tested against all commonly used Key
Management solutions (e.g. HashiCorp Vault).
MinIO uses a key management system (KMS) or cryptographic key management system (CKMS)
to support SSE-S3. If a client requests SSE-S3, or auto-encryption is enabled, the MinIO server
encrypts each object with a unique object key which is protected by a master key managed by
the KMS. Given the exceptionally low overhead, auto-encryption can be turned on for every
application and instance.
When WORM is enabled, MinIO disables all APIs that can potentially mutate the object data and
metadata. This means that data once written becomes tamper-proof. This has practical
applications for a number of different regulatory requirements.
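The write-once-read-many behaviour can be illustrated with a toy in-memory model. This is a sketch of the concept only, assuming that the first write of a key is allowed and every later mutation is refused; the class and method names are hypothetical and do not correspond to any MinIO API.

```python
class WormBucket:
    """Toy write-once-read-many store: objects can be created and read, never changed."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        if key in self._objects:
            # Overwriting would mutate existing data, so it is refused.
            raise PermissionError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        # Deletion is a mutation too, so it is always refused.
        raise PermissionError(f"{key} cannot be deleted while WORM is enabled")


bucket = WormBucket()
bucket.put("audit/2019-10.log", b"first entry")
try:
    bucket.put("audit/2019-10.log", b"tampered entry")
except PermissionError as err:
    print(err)
```

Reads keep working after the refused write, and the stored bytes are still the original ones.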
Figure 7: Encryption and WORM protect data in flight and at rest.
Global Federation
The modern enterprise has data everywhere. MinIO allows those various instances to be
combined to form a unified global namespace. Specifically, up to 32 MinIO servers can be
combined into a Distributed Mode set and multiple Distributed Mode sets can be combined into
a MinIO Server Federation. Each MinIO Server Federation provides a unified admin and
namespace.
A MinIO Server Federation supports an unlimited number of Distributed Mode sets.
The impact of this approach is that an object store can scale massively for large, geographically
distributed enterprise while retaining the ability to accommodate a variety of analytical
approaches (S3 Select, MinSQL, Spark, Hive, Presto, TensorFlow, H2O) from a single console.
There are multiple benefits to MinIO’s cluster and federation architecture:
- Each node is an equal member of a MinIO cluster. There is no master node.
- Each node can serve requests for any object in the cluster, even concurrently.
- Each cluster uses a Distributed Locking Manager (DLM) to manage updates and deletes to objects.
- The performance of an individual cluster remains constant as you add more clusters to the federation.
- Failure domains are kept within the cluster. An issue with one cluster does not affect the entire federation.
When deploying a cluster, it is recommended that you use a programmable domain name service
(DNS), such as CoreDNS, to route HTTP(S) requests to the appropriate cluster. Also, use a load
balancer to balance the load across the servers in a cluster. Global configuration parameters can
be stored and managed in etcd (an open-source distributed key-value store).
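The routing step can be pictured with a small Python sketch. It assumes, purely for illustration, that requests are routed by bucket name and that the mapping from bucket to cluster lives in a shared table; a plain dictionary stands in for the DNS records and etcd, and all bucket names and endpoints are hypothetical.

```python
# Hypothetical routing table: bucket name -> endpoint of the cluster that owns it.
BUCKET_TO_CLUSTER = {
    "training-data": "https://cluster-east.example.internal",
    "backups": "https://cluster-west.example.internal",
}


def resolve_cluster(bucket: str, table: dict = BUCKET_TO_CLUSTER) -> str:
    """Return the cluster endpoint that should serve requests for a bucket."""
    try:
        return table[bucket]
    except KeyError:
        raise LookupError(f"no cluster registered for bucket {bucket!r}") from None


def object_url(bucket: str, key: str) -> str:
    """Build a path-style URL for an object on the cluster that owns its bucket."""
    return f"{resolve_cluster(bucket)}/{bucket}/{key}"


print(object_url("training-data", "images/0001.png"))
```

Because each bucket resolves to exactly one cluster, a problem in one cluster affects only the buckets it owns, which matches the failure-domain property listed above.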
Figure 8: Global federation enables almost infinite scalability
Multi-Cloud Gateway
All enterprises are adopting a multi-cloud strategy.
To support hybrid cloud initiatives, MinIO can be deployed in gateway mode to leverage public
cloud resources. Leveraging the same binary, MinIO enables companies to run their applications
on premises or in the public cloud with no modification. This minimizes operational overhead, and
provides flexibility to move data and applications as business requirements change, not locking
into a specific cloud provider or proprietary architecture. Achieving this requires that your bare-metal,
virtualization, container and public cloud services (including non-S3 providers like Google,
Microsoft and Alibaba) look identical. MinIO runs on bare metal, network attached storage and
every public cloud. More importantly, MinIO ensures your view of that data looks exactly the same
from an application and management perspective via the Amazon S3 API.
MinIO can go even further, making your existing storage infrastructure compatible with Amazon
S3. The implications are profound. Now organizations can truly unify their data infrastructure -
from file to block, all appearing as objects accessible via the Amazon S3 API without the
requirement for migration.
Figure 9: Gateway mode is designed to make every cloud and NAS look like S3.
Continuous Replication
The challenge with traditional replication approaches is that they do not scale effectively beyond
a few hundred TB. Having said that, everyone needs a replication strategy to support disaster
recovery (DR) and that strategy needs to span geographies, data centers and clouds. MinIO’s
continuous replication is designed for large scale, cross data center deployments. By leveraging
Lambda compute notifications and object metadata it can compute the delta efficiently and
quickly.
Lambda notifications ensure that changes are propagated immediately as opposed to traditional
batch methods. Continuous replication means that data loss will be kept to a bare minimum
should a failure occur - even in the face of highly dynamic datasets. Finally, like all that MinIO
does, continuous replication is multi-vendor, meaning that your backup location can be anything
from NAS to the public cloud.
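Computing a delta from object metadata can be sketched as a comparison of two listings. The example below assumes each site exposes a mapping from object key to an ETag-like content identifier; that representation, the function name and the sample keys are assumptions made for illustration, not MinIO's replication code.

```python
def compute_delta(source: dict, target: dict) -> tuple:
    """Compare key -> etag listings and return (keys to copy, keys to delete)."""
    # Copy anything missing on the target or whose content identifier differs.
    to_copy = sorted(key for key, etag in source.items() if target.get(key) != etag)
    # Delete anything on the target that no longer exists at the source.
    to_delete = sorted(key for key in target if key not in source)
    return to_copy, to_delete


# Hypothetical listings for a primary site and its replica.
primary = {"a.csv": "etag-1", "b.csv": "etag-9", "c.csv": "etag-3"}
replica = {"a.csv": "etag-1", "b.csv": "etag-2", "old.csv": "etag-7"}

to_copy, to_delete = compute_delta(primary, replica)
print("copy:", to_copy, "delete:", to_delete)
```

Only the changed object (`b.csv`) and the new one (`c.csv`) are transferred; unchanged data is never re-read. An event-driven design narrows this further by reacting to each change as it is reported instead of re-listing both sides.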
Figure 10: MinIO’s continuous replication approach safeguards even dynamic data
Metadata Architecture
MinIO has no separate metadata store. All operations are performed atomically at object level
granularity. This approach contains any failure within a single object and prevents
spillover into larger system failures. Each object is strongly protected with erasure code and bitrot
hash. You can crash a cluster in the middle of a busy workload and still not lose any data.
Another advantage of this design is strict consistency which is important for distributed machine
learning and big data workloads.
Cloud Native
The multi-instance, multi-tenant design of MinIO enables Kubernetes-like orchestration
platforms to seamlessly manage storage resources just like compute resources. Each instance of
MinIO is provisioned on demand through self-service registration. Traditional storage systems
are monolithic and compete with Kubernetes resource management. MinIO is lightweight and
container friendly so you can pack many tenants simultaneously on the same shared
infrastructure.
Lambda Function Support
MinIO supports Amazon compatible Lambda event notifications, which enable applications to
be notified of individual object actions such as access, creation, and deletion. The events can be
delivered using industry standard messaging platforms like Kafka, NATS, AMQP, MQTT,
Webhooks, or a database such as Elasticsearch, Redis, Postgres, and MySQL.
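A consumer of such notifications might look like the following sketch. It assumes the payload follows the S3 event record layout (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) and that object keys arrive URL-encoded, as they do in S3 events; the sample payload and the function name are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def summarize_event(payload: str) -> list:
    """Extract (event name, bucket, object key) from an S3-style notification."""
    event = json.loads(payload)
    summary = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in S3-style events, so decode before use.
        key = unquote_plus(s3["object"]["key"])
        summary.append((record["eventName"], s3["bucket"]["name"], key))
    return summary


# Hypothetical payload, as it might arrive on a webhook or message queue.
sample = json.dumps({
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "reports"},
            "object": {"key": "2019/q3+summary.csv", "size": 2048},
        },
    }]
})

for name, bucket, key in summarize_event(sample):
    print(f"{name}: {bucket}/{key}")
```

The same function could be called from a Kafka consumer, an MQTT subscriber or a webhook handler, since only the transport differs.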
Benchmark Performance: S3 Bench
Performance claims require context and benchmarks. MinIO tests against a number of different
benchmarks from S3 to DFS.io and TPC. The following represents the summary results of our S3
Bench testing on commodity and high performance hardware. Full documentation of the testing,
setup and environments can be found on MinIO’s website.
Our HDD results running on a 16-node MinIO cluster were:
Setup                         Avg Read Throughput (GET)   Avg Write Throughput (PUT)
Distributed                   10.81 GB/s                  8.57 GB/s
Distributed with Encryption   9.38 GB/s                   6.91 GB/s
Our NVMe results running on an 8 node MinIO cluster were:
Setup                         Avg Read Throughput (GET)   Avg Write Throughput (PUT)
Distributed                   38.7 GB/s                   34.4 GB/s
Distributed with Encryption   36.9 GB/s                   34.6 GB/s
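The cost of encryption implied by these figures can be worked out directly. The short Python sketch below uses only the throughput numbers reported above; the helper name and the dictionary layout are illustrative.

```python
# (plain GB/s, encrypted GB/s) taken from the two result tables above.
RESULTS = {
    "HDD, 16 nodes": {"GET": (10.81, 9.38), "PUT": (8.57, 6.91)},
    "NVMe, 8 nodes": {"GET": (38.7, 36.9), "PUT": (34.4, 34.6)},
}


def encryption_overhead_pct(plain: float, encrypted: float) -> float:
    """Throughput lost to encryption, as a percentage of the unencrypted rate."""
    return (plain - encrypted) / plain * 100


for setup, operations in RESULTS.items():
    for operation, (plain, encrypted) in operations.items():
        overhead = encryption_overhead_pct(plain, encrypted)
        print(f"{setup} {operation}: {overhead:.1f}%")
```

On HDD the overhead works out to roughly 13.2% for reads and 19.4% for writes. On NVMe it is about 4.7% for reads, while the encrypted write figure is marginally higher than the unencrypted one (about -0.6%).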
An Enduring Commitment to Open Source
MinIO operates under the Apache V2 license. The company’s products are 100% open source.
MinIO’s commitment is that, as long as it remains independent, it will continue to be 100% open source.
The advantages of Open Source are well documented. These include the avoidance of vendor
lock-in, security, consistent innovation, transparency, and the reliability that comes with millions
of community members hammering every release from every possible angle.
MinIO remains the owner of the MinIO object storage project and as such controls the quality
and development through its weekly release cadence. MinIO runs a suite of acceptance tests for
every pull request and every MinIO server release.
Understanding the SUBNET Subscription Offering
While MinIO is available under the open source Apache V2 license, many customers choose to
purchase the software on an annual subscription basis. Their reasons for doing so differ, but
they are unified in the value they see in the software coupled with a desire to have a deeper
relationship with the team behind MinIO.
The SUBNET subscription includes the object storage suite and all new features, individually
tracked and prioritized security updates and bug fixes, advanced diagnostics, real time
engineering support, customized release management and support for older production
implementations of MinIO.
SUBNET provides an extra measure of assurance to our enterprise customers with production
deployments ranging from Terabyte to Exabyte.
Conclusion
MinIO is the fastest growing object storage system in the world for a reason. It was designed
from scratch to be a key part of the modern data stack solving critical problems for enterprises
while seamlessly integrating with its existing data and application infrastructure. It delivers
performance, scalability and simplicity alongside an enterprise grade feature set.
As importantly, MinIO is 100% open source with all of the attendant benefits. Finally, for our
production customers we offer the security that comes from a direct engagement with MinIO
engineering via the SUBNET subscription offering.
The result is the industry’s most comprehensive solution for the rapidly growing world of object
storage.
About MinIO
Founded in 2014, MinIO is now the world’s fastest growing object storage system. Backed by
some of the smartest minds in storage and venture capital including Nexus, General Catalyst,
Dell Technologies Capital, Intel Capital, AME Cloud Ventures and key angel investors, the
company has raised $23.3M through its Series A round.
Additional Information
Email: hello@Min.io
MinIO Inc.
530B University Avenue,
Palo Alto, CA 94301
© 2019 MinIO, Inc.
Resources
https://min.io
https://docs.min.io./
https://blog.min.io./
*numbers in map as of October 2019