The Architectural Progress and Future Outlook of TiDB's HTAP Practice
1. The HTAP architecture design of TiDB,
and the improvement in v6.2
Wei Wan @ PingCAP
3. About Me
Wei Wan works at PingCAP as the leader of the OLAP Storage team, with over
11 years of experience in gaming, e-commerce, mobile apps, and database
development.
4. TiDB Introduction
TiDB is an open-source NewSQL database that supports HTAP workloads. It
is MySQL-compatible and features horizontal scalability, strong consistency,
and high availability.
The goal of TiDB is to provide users with a one-stop database solution that
covers OLTP (Online Transactional Processing), OLAP (Online Analytical
Processing), and HTAP services.
5. Agenda
1. A typical use case
2. Challenges to the storage module in HTAP scenarios
3. The improvements in TiDB v6.2
4. TiDB's future architectural evolution direction
6. TiDB Core Architecture
7. A typical use case of HTAP workloads
8. ZTO - Waybill System (Exadata to TiDB)
9. ZTO - Waybill System (Exadata to TiDB)
10. ZTO - Waybill System (Exadata to TiDB)
• Unlimited scalability
• No more sharding
“Sharding had reached 16,000 tables; the business could not scale any further.”
• Greatly reduced the workload of business development
• Report latency reduced from 30 minutes to 2 minutes
• Reduced the cost of database servers
• Big data infrastructure adaptation
11. Challenges to the storage module in HTAP scenarios
12. Isolation between OLTP and OLAP workloads
• Isolation is difficult if we mix them in the same node
• TP and AP scale separately
• Different hardware requirements
• Different optimal data structures
• Row-based vs. column-based
• Indexes
13. Isolation between OLTP and OLAP workloads
14. Isolation between OLTP and OLAP workloads (HATtrick Bench)
15. Realtime synchronization and strong consistency
Raft Learner Read
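The Raft Learner Read mechanism lets a non-voting replica (TiFlash) serve strongly consistent reads: it asks the leader for the current commit index and waits until its own applied index catches up before reading. A minimal sketch of that wait protocol, with illustrative class and method names (not TiKV's actual API):

```python
# Hypothetical sketch of a Raft learner read. The learner fetches the
# leader's commit index as the linearization point, then waits until its
# own applied index reaches it, so the read sees every write committed
# before the read request started.

class Leader:
    def __init__(self):
        self.commit_index = 0

    def append(self, n=1):
        self.commit_index += n          # writes committed on the leader (TiKV)

    def read_index(self):
        return self.commit_index        # linearization point for the read


class Learner:
    def __init__(self, leader):
        self.leader = leader
        self.applied_index = 0

    def replicate(self, n=1):
        # Raft log replication applies committed entries asynchronously.
        self.applied_index = min(self.applied_index + n, self.leader.commit_index)

    def learner_read(self):
        target = self.leader.read_index()
        while self.applied_index < target:   # wait (here: pull) until caught up
            self.replicate()
        return self.applied_index            # now safe to serve a snapshot read


leader = Leader()
learner = Learner(leader)
leader.append(5)                 # five committed writes on the row store
idx = learner.learner_read()     # TiFlash-style learner read
```

The key property is that the wait guarantees freshness without the learner participating in Raft voting, so AP reads never block TP writes.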
16. Realtime update of columnar store
The big issue is the balance between READ and WRITE.
• High-frequency updates bring fragmentation
• Too many files slow down write speed
• A columnar store needs to split columns into different files
• Too many IOPS
• Transaction support needs to store multiple versions
• Sort-merge is slow
17. Realtime update of columnar store
The solution: DeltaTree, the storage engine of TiFlash
• Introduces row versions
• Transforms updates and deletes into append operations
• Uses a WAL and mem-table to batch updates into small groups
• Avoids too many IOPS and files
• Adopts a Delta + Stable layer architecture
• Each layer uses a different storage strategy
• Accelerates sort-merge with a DeltaIndex
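The Delta + Stable idea above can be sketched in a few lines. This is an illustrative toy, not TiFlash's real data structures: updates and deletes are appended to a small delta layer, reads sort-merge the delta over the compacted stable layer with the newest version winning, and compaction folds the delta back into stable:

```python
# Toy sketch of a DeltaTree-style segment (illustrative names only).
# Updates/deletes become appends to the delta layer; scans merge the
# delta over the stable layer; compaction clears fragmentation.

class DeltaTreeSegment:
    def __init__(self, stable_rows):
        # stable layer: compacted (key, value) pairs
        self.stable = dict(stable_rows)
        # delta layer: append-only log of (key, value_or_None, version)
        self.delta = []
        self.version = 0

    def upsert(self, key, value):
        self.version += 1
        self.delta.append((key, value, self.version))    # update as append

    def delete(self, key):
        self.version += 1
        self.delta.append((key, None, self.version))     # delete as append

    def scan(self):
        # sort-merge: newest delta entry per key wins over stable
        merged = dict(self.stable)
        for key, value, _ in self.delta:                 # delta is version-ordered
            if value is None:
                merged.pop(key, None)
            else:
                merged[key] = value
        return sorted(merged.items())

    def compact(self):
        # fold the delta into a new stable layer
        self.stable = dict(self.scan())
        self.delta.clear()


seg = DeltaTreeSegment([(1, "a"), (2, "b"), (3, "c")])
seg.upsert(2, "b2")      # update becomes an append
seg.delete(3)            # delete becomes an append
```

The append-only delta keeps writes cheap and sequential; the cost is that every scan must merge the two layers, which is what the DeltaIndex addresses.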
18. The write path of DeltaTree
19. Adopting DeltaIndex to accelerate scan speed
The scan speed of DeltaTree is 3x that of
ClickHouse's SELECT … FINAL
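The DeltaIndex speedup can be sketched as caching the previous merge: if the cached result remembers how many delta entries it has already folded in, a repeated scan only replays the delta suffix that arrived since, instead of redoing the full sort-merge. A hypothetical sketch (names are illustrative, not TiFlash's):

```python
# Toy sketch of the DeltaIndex idea: remember how far into the delta
# layer the last scan merged, so repeated scans reuse prior merge work
# and only replay newly appended delta entries.

class IndexedSegment:
    def __init__(self, stable_rows):
        self.stable = dict(stable_rows)
        self.delta = []              # append-only (key, value_or_None) entries
        self._merged = None          # cached merge result ("delta index" state)
        self._merged_upto = 0        # delta entries already folded into cache

    def upsert(self, key, value):
        self.delta.append((key, value))

    def delete(self, key):
        self.delta.append((key, None))

    def scan(self):
        if self._merged is None:
            self._merged = dict(self.stable)
            self._merged_upto = 0
        # replay only the delta suffix appended since the last scan
        for key, value in self.delta[self._merged_upto:]:
            if value is None:
                self._merged.pop(key, None)
            else:
                self._merged[key] = value
        self._merged_upto = len(self.delta)
        return sorted(self._merged.items())


seg = IndexedSegment([(1, "a"), (2, "b")])
seg.upsert(3, "c")
seg.scan()                          # full merge, cached
seg.delete(1)                       # next scan replays only this entry
```

This is why repeated analytical scans over a mostly stable segment approach the speed of reading the stable layer alone.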
20. The improvements in TiDB v6.2
21. PageStorage before 6.2
V1
V2
22. PageStorage in 6.2
V3
23. PageStorage in 6.2
In a typical HTAP workload,
• AP QPS improves by 30%
• Peak CPU usage decreases from 3000% to 2500%
• Peak memory usage decreases from 28 GB to 18 GB
• Peak write throughput decreases by over 30%
24. DataSharing
25. TPCH performance improvement
26. TiDB's future architectural evolution direction
27. TiDB Core Architecture
28. Evolution towards cloud native
Goal:
• Better cost efficiency
• Faster scalability
• Higher availability
Direction:
• Cloud native
• Take full advantage of cloud infrastructure
• Resource pool
29. Cloud native
• S3: unlimited capacity and low-cost storage
• Only stateless and light-state nodes can scale fast
• Write and read path decoupling
• Low cost
• Stable service
• Scale separately
30. Resource pool
• For database serverless users
• Reduce idle resources
• TiDB clusters of any size can benefit from the resource pool
• Dynamically scale in or out to achieve consistent performance at lower cost
• Security
• Multi-tenancy at the VM instance level
• Authentication and encryption to protect user data