Unified Service System Architecture for the Product Detail Page


1. Unified Service System Architecture for the Product Detail Page

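2. Architecture
The product detail page depends on many services, spread across different departments.
Problems:
• There is no monitoring data on service quality
• When a service fails, it cannot be degraded promptly
• API calls are scattered
• Domain names are resolved repeatedly, and the benefits of persistent connections are lost
www.jd.com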

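3. Overall architecture principles
• Stateless by design
• nginx + lua + tomcat7 architecture
• Make full use of local caches (proxy cache, shared_dict, or Java Guava cache)
• Use a local twemproxy to shard redis, and separate the caches (critical business apart from everything else)
• Separate thread pools for core and non-core business, with thread pool monitoring and switches
• Update asynchronously
• Redis cluster uses a master-slave architecture
• Use unix domain sockets to reduce the number of connections
• Use keepalive persistent connections
• Plan the switches carefully: cache time, whether to call the backend service, and fallback (recovery from fallback uses an exponential / random recovery mechanism)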
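
The slide does not show how the "exponential / random" recovery from fallback is implemented. One possible reading is an exponential backoff with random jitter: after each backend failure, the wait before the next probe doubles, and a random offset keeps many nginx/tomcat instances from probing at the same moment. The sketch below follows that reading; the class name `FallbackRecovery`, its methods, and the delay parameters are hypothetical and not taken from the original system.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: decides when a degraded backend may be called again. */
class FallbackRecovery {
    private final long baseDelayMillis;
    private final long maxDelayMillis;
    private int consecutiveFailures = 0;
    private long nextProbeAtMillis = 0;

    FallbackRecovery(long baseDelayMillis, long maxDelayMillis) {
        this.baseDelayMillis = baseDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** True if the backend may be called now; false means serve the fallback. */
    synchronized boolean shouldCallBackend(long nowMillis) {
        return nowMillis >= nextProbeAtMillis;
    }

    /** After a failure, push the next probe out by an exponential delay plus random jitter. */
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        long delay = exponentialDelay(consecutiveFailures);
        // Jitter in [0, delay / 2] spreads out probes from different instances.
        long jitter = ThreadLocalRandom.current().nextLong(delay / 2 + 1);
        nextProbeAtMillis = nowMillis + delay + jitter;
    }

    /** A successful call ends the degraded state. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        nextProbeAtMillis = 0;
    }

    /** base * 2^(failures - 1), capped at maxDelayMillis. */
    long exponentialDelay(int failures) {
        int shift = Math.min(Math.max(failures - 1, 0), 20);
        return Math.min(maxDelayMillis, baseDelayMillis << shift);
    }
}
```

A caller would check `shouldCallBackend(System.currentTimeMillis())` before each request and serve the fallback content whenever it returns false.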

4. Twemproxy+Redis
• Use a local twemproxy to shard redis, and separate the caches (critical business apart from everything else)
• Redis cluster uses a centralized master-slave architecture; consider the replication buffer size
• Consider unix domain sockets, with the socket file placed on an in-memory file system

    cache_unix_domain_socket_slave:
      listen: /tmpfs/redis_slave.sock

    -- master
    local masterDomainsocket = "unix:/dev/shm/redis_master.sock"
    -- slave
    local slaveDomainsocket = "unix:/dev/shm/redis_slave.sock"

    -- read from the slave first; fall back to the master on error
    local status, resp = redis_get(slaveDomainsocket, key, "nginx.harmony." .. umpKey .. ".redis.get")
    if status == STATUS_ERR then
        status, resp = redis_get(masterDomainsocket, key, "nginx.harmony." .. umpKey .. ".redis.get")
    end

• Consider using Hash Tag
• Consider the Redis cache eviction policy

    maxmemory-policy allkeys-lru
    maxmemory-samples 10
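
The Hash Tag bullet can be illustrated with a small sketch. With a hash tag such as `{}` configured, only the part of the key between the two tag characters is hashed, so related keys for the same SKU land on the same redis shard. The code below is an illustration only: the CRC32-modulo shard choice is an assumption made for brevity and is not twemproxy's actual distribution algorithm, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative only: shows the effect of a "{}" hash tag on shard selection. */
class HashTagSharding {

    /** Returns the text between '{' and '}' if a non-empty tag exists, else the whole key. */
    static String hashPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    /** Assumed distribution: CRC32 of the hashed part, modulo the shard count. */
    static int shardFor(String key, int shardCount) {
        CRC32 crc = new CRC32();
        crc.update(hashPart(key).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % shardCount);
    }
}
```

With this scheme, `price:{1541281061}` and `stock:{1541281061}` always map to the same shard, because only `1541281061` is hashed.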

5. Nginx keep alive
• Use persistent connections, and limit their number

    # keepalive
    keepalive_requests 2000;
    keepalive_timeout 8s;

    upstream backend_tomcat {
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        check interval=3000 rise=1 fall=2 timeout=2000 type=tcp default_down=false;
        keepalive 50;
    }

    # to the backend tomcat
    location ~ ^/backend/(.*)$ {
        internal;
        uninitialized_variable_warn off;
        access_log off;
        proxy_pass_request_headers off;
        proxy_hide_header Vary;
        # enable keep-alive to the upstream
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host web.c.3.local;
        proxy_pass http://backend_tomcat/$1$is_args$args;
    }

6. Nginx timeout
• Set timeouts

    # proxy timeout
    proxy_connect_timeout 100ms;
    proxy_read_timeout 500ms;
    proxy_send_timeout 1s;

    # lua socket timeout
    lua_socket_connect_timeout 100ms;
    lua_socket_read_timeout 500ms;
    lua_socket_send_timeout 1s;

7. Nginx proxy cache
• Put the Nginx proxy cache on an in-memory file system

    mount tmpfs /tmpfs -t tmpfs -o size=1g

    proxy_temp_path /tmpfs/proxy_temp;
    proxy_cache_path /tmpfs/proxy_cache levels=1:2 keys_zone=cache:200m inactive=5m max_size=8g;
    proxy_cache cache;
    proxy_cache_valid 200 5s;
    proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504 updating;

• Watch how response headers affect the cache time
Parameters of caching can also be set directly in the response header. This has higher priority than setting of caching time using the directive.
• The “X-Accel-Expires” header field sets caching time of a response in seconds. The zero value disables caching for a response. If the value starts with the @ prefix, it sets an absolute time in seconds since Epoch, up to which the response may be cached.
• If the header does not include the “X-Accel-Expires” field, parameters of caching may be set in the header fields “Expires” or “Cache-Control”.
• If the header includes the “Set-Cookie” field, such a response will not be cached.
• If the header includes the “Vary” field with the special value “*”, such a response will not be cached (1.7.7). If the header includes the “Vary” field with another value, such a response will be cached taking into account the corresponding request header fields (1.7.7).

8. Nginx proxy cache
• Do not cache a response when the data is bad

    set $no_cache "";
    proxy_cache abc;
    proxy_cache_valid 200 3m;
    proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504 updating;
    proxy_no_cache $no_cache;

    location @fallback {
        # mark the fallback page as not cacheable
        set $no_cache 1;
        echo '<!DOCTYPE html><html><head><title> 京东全球购 </title><meta http-equiv="X-UA-Compatible" content="IE=Edge" /></head><frameset cols="100%"><frame src="$fallback_url" /></frameset></html>';
    }

9. Nginx DNS
• Resolve through a local DNS when using proxy_pass

    # dns resolver (using dnsmasq)
    resolver 127.0.0.1 valid=5s;
    resolver_timeout 200ms;

    proxy_pass http://p.3.local/prices/get$is_args$args;

• The name may resolve to multiple servers (check with nslookup); nginx automatically moves on to the next upstream
• NGINX Plus supports dynamic resolution of upstreams

10. Nginx Gzip
• Set gzip_comp_level, gzip_min_length, and gzip_types according to your own needs

    gzip on;
    gzip_min_length 1k;
    gzip_buffers 16 16k;
    gzip_http_version 1.0;
    gzip_proxied any;
    gzip_comp_level 4;
    gzip_types text/plain application/x-javascript text/css application/xml;
    gzip_vary on;

11. Nginx upstream
• upstream health check

    upstream backend_tomcat {
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        check interval=3000 rise=1 fall=2 timeout=2000 type=tcp default_down=false;
        keepalive 50;
    }

• upstream strategy: ip_hash

    upstream backend_tomcat {
        ip_hash;
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        server 172.16.108.128:1601 max_fails=2 fail_timeout=30s weight=5;
        check interval=3000 rise=1 fall=2 timeout=2000 type=tcp default_down=false;
        keepalive 50;
    }

12. Nginx upstream
• upstream strategy: hash key [consistent];

    upstream nginx_server {
        # $arg_skuId is nginx's built-in variable for the skuId query argument
        hash $arg_skuId consistent;
        server 127.0.0.1 max_fails=2 fail_timeout=30s weight=5;
        server 127.0.0.1 max_fails=2 fail_timeout=30s weight=5;
        check interval=3000 rise=1 fall=2 timeout=2000 type=tcp default_down=false;
        keepalive 50;
    }

    server {
        listen 80;
        server_name c.3.cn c2014.3.cn c.3.local;
        #access_log /export/servers/nginx/logs/c.3.cn/c.3.cn_access.log main;
        access_log off;
        log_subrequest off;
        error_log /export/servers/nginx/logs/c.3.cn/c.3.cn_error.log warn;

        location / {
            access_log off;
            proxy_set_header Host nginx.c.3.local;
            proxy_pass http://nginx_server;
        }
    }
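
To show why the `consistent` flag matters for cache hit rates, here is a minimal sketch of a consistent hash ring. It is a generic illustration, not nginx's implementation (nginx documents its method as ketama-compatible); the CRC32 hash, the virtual node count, and the class name are assumptions chosen for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Generic consistent hash ring with virtual nodes (illustrative sketch). */
class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> servers, int virtualNodes) {
        for (String server : servers) {
            // Each server owns several points on the ring to even out the load.
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** Picks the first ring point at or after the key's hash, wrapping around at the end. */
    String serverFor(String key) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

When a server is removed from the ring, only the keys it owned move to other servers; every other skuId keeps hitting the same backend, so that backend's local cache stays warm.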

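13. Nginx Real ip

    real_ip_header J-Forwarded-For;
    real_ip_recursive on;
    set_real_ip_from 192.168.0.0/16;
    set_real_ip_from 172.0.0.0/8;

• VIP: addresses starting with 192. or 172. are internal; the rest can be regarded as public, for example 211.
• VIP binding: a single VIP bound to a single data center, or a single VIP bound to multiple data centers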

14. Nginx client header

    client_header_buffer_size 4k;
    large_client_header_buffers 4 4k;
    client_body_buffer_size 128k;
    client_max_body_size 1m;

    proxy_method GET;
    proxy_pass_request_body off;
    proxy_pass_request_headers off;

15. Nginx limit
• limit request
http://www.nginx.cn/446.html

16. Nginx limit
• limit connection / limit rate
http://www.nginx.cn/446.html

17. Nginx limit
• IP whitelist / blacklist
• user-agent whitelist / blacklist
• Token-based rate limiting
http://drops.wooyun.org/tips/734
• Leaky bucket algorithm and token bucket algorithm (Java Guava rate limit)
http://www.cnblogs.com/LBSer/p/4083131.html
http://leyew.blog.51cto.com/5043877/860302
• Delay-based rate limiting
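
The token bucket algorithm named above can be sketched in a few lines. This is a hand-written illustration rather than Guava's rate limiter: tokens refill at a fixed rate up to a capacity, each request consumes one token, and a request is rejected when the bucket is empty. The class name and the explicit `nowMillis` parameter (used instead of the system clock so the behavior is easy to follow) are assumptions of this sketch.

```java
/** Minimal token bucket: allows bursts up to capacity, then a steady refill rate. */
class TokenBucket {
    private final long capacity;
    private final double refillPerMilli;
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(long capacity, double tokensPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity;          // start with a full bucket
        this.lastRefillMillis = nowMillis;
    }

    /** Takes one token if available; returns false when the request should be limited. */
    synchronized boolean tryAcquire(long nowMillis) {
        long elapsed = Math.max(0, nowMillis - lastRefillMillis);
        tokens = Math.min(capacity, tokens + elapsed * refillPerMilli);
        lastRefillMillis = Math.max(lastRefillMillis, nowMillis);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In a real service the caller would pass `System.currentTimeMillis()` and return an error or a fallback response whenever `tryAcquire` returns false.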

18. Nginx+Lua shared_dict
• Use shared dictionaries as a local cache

    lua_shared_dict harmony_local_accessories_cache 200m;
    lua_shared_dict harmony_local_suit_cache 200m;
    lua_shared_dict harmony_local_combination_cache 200m;

    -- choose the shared dict and TTL from the key prefix
    local function find_local_cache_strategy(key)
        local cache = nil
        local ttl = nil
        if string_find(key, ACCESSORIES_KEY_PREFIX, 1, true) then
            cache = local_accessories_cache
            ttl = 1 * 60 -- cache locally for 1 minute
        elseif string_find(key, SUIT_KEY_PREFIX, 1, true) then
            cache = local_suit_cache
            ttl = 1 * 60 -- cache locally for 1 minute
        elseif string_find(key, COMBINATION_KEY_PREFIX, 1, true) then
            cache = local_combination_cache
            ttl = 1 * 60 -- cache locally for 1 minute
        end
        if not cache then
            return nil, nil
        end
        return cache, ttl
    end

    local function local_cache_get(key)
        local cache, ttl = find_local_cache_strategy(key)
        if not cache then
            return nil
        end
        return cache:get(key)
    end

19. Nginx+Lua API merging
• The method parameter of the request names the services being requested

    c.3.cn/g?method=price,promise,ads,stock&skuId=1541281061&skuIdKey=F0AF023018C93BA4CC4FEB9791CC72B8&area=1_72_2799_0&cat=652,828,6882&extraParam={"originid":"1"}&venderId=116906&buyNum=1&vType=1&callback=abc

• Validate the data up front; illegal input gets a 403 immediately (prevents XSS)

    local skuId = getSkuId(args['skuId'])
    local skuIdKey = getSkuIdKey(args['skuIdKey'])
    local cat = getCat(args['cat'])
    local area = getArea(args['area'])

• Encapsulate the call logic and fix the parameter order to raise the cache hit rate

    local apis = {}
    if methods['price'] then
        apis['price'] = {
            method = 'price',
            callback = "G_setPrice",
            url = "/prices/get",
            args = "skuId=J_" .. skuId .. "&type=1&area=" .. area[1] .. "_" .. area[2] .. "_" .. area[3]
        }
    end
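
Fixing the parameter order matters because a proxy cache keyed on the URL treats `a=1&b=2` and `b=2&a=1` as two different entries. The slide fixes the order by building the args string by hand; the sketch below shows a related technique, sorting parameter names, which reaches the same goal for arbitrary parameter sets. The class and method names are hypothetical, and values are assumed to be already validated and URL-safe.

```java
import java.util.Map;
import java.util.TreeMap;

/** Builds a query string with parameters in a stable (sorted) order. */
class CanonicalArgs {
    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        // TreeMap iterates keys in sorted order, regardless of how the caller supplied them.
        for (Map.Entry<String, String> e : new TreeMap<>(params).entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

Two requests that carry the same parameters in a different order then produce the same backend URL and share one cache entry.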

20. Nginx+Lua API merging
• Merge through Nginx subrequests (ngx.location.capture_multi)
• Cache only the atomic APIs
• Go through one proxy layer that retries and records UMP logs

    # proxy layer: mainly records UMP data and retries on failure
    location ~ ^/proxy/(\w+)(/.*)$ {
        internal;
        uninitialized_variable_warn off;
        access_log off;
        set $method $1;
        set $proxy_uri $2$is_args$args;
        lua_code_cache on;
        content_by_lua_file "/export/App/c.3.cn/lualib/jd/proxy.lua";
    }

    local proxy_uri = var.proxy_uri
    local method = var.method
    local resp = capture(proxy_uri)
    local status = resp.status
    local body = resp.body
    local request_time = tonumber(var.request_time) * 1000

    -- on a 502, 503, or 504 failure that took less than 200 ms, retry once
    if (status == 502 or status == 503 or status == 504) and request_time < 200 then
        resp = capture(proxy_uri)
        status = resp.status
        body = resp.body
        request_time = request_time + tonumber(var.request_time) * 1000
    end
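
The retry rule in the Lua proxy (retry once, only for 502/503/504, and only when the failed attempt was fast) can be restated as a small Java sketch. It mirrors the logic of the slide but is not part of the original system; `RetryOnce`, `Response`, and the `Supplier`-based backend call are hypothetical stand-ins for the nginx subrequest.

```java
import java.util.function.Supplier;

/** Retries a backend call once when it failed quickly with a gateway error. */
class RetryOnce {
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    static Response call(Supplier<Response> backend, long fastFailureMillis) {
        long start = System.nanoTime();
        Response resp = backend.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // A slow failure is not retried: the request has already used up its time budget.
        if (isGatewayError(resp.status) && elapsedMillis < fastFailureMillis) {
            resp = backend.get();
        }
        return resp;
    }

    static boolean isGatewayError(int status) {
        return status == 502 || status == 503 || status == 504;
    }
}
```

The 200 ms threshold from the slide would be passed as `fastFailureMillis`, which caps the worst-case latency added by the retry.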

21. Nginx+Lua logging
• Logging can also be done through set_by_lua_file and log_by_lua_file

    set_by_lua_file $beginTime "/export/App/c.3.cn/lualib/jd/ump_proxy_log.lua";
    log_by_lua_file "/export/App/c.3.cn/lualib/jd/ump_proxy_log.lua";

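22. Java architecture
• Asynchronous, non-blocking event model
The asynchronous model is supported from Servlet 3 onward, in Tomcat 7 / Jetty 8 and later; Jetty 6 Continuations are the same concept.
We can break request processing down into individual events. Dividing a request into events gives us more control. For example, we can create separate thread pools for different business functions: the tomcat thread pool is used only to parse requests, and the actual processing is handed to our own thread pools. The tomcat thread pool then stops being the bottleneck that currently leaves no room for optimization.
With this asynchronous event model we can raise overall throughput and keep a slow business function A from dragging down the others. What is slow stays slow, but it no longer affects other business functions.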
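
The isolation argument above can be demonstrated with plain `java.util.concurrent` executors, leaving the Servlet 3 plumbing aside. In this sketch the pool for a slow business function is completely saturated, yet a task on the core pool still completes. Pool sizes, names, and the "price-ok" result are illustrative assumptions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Shows that a saturated slow pool does not delay work on a separate core pool. */
class IsolatedPools {
    static String fastResultWhileSlowPoolIsBlocked() {
        ExecutorService corePool = Executors.newFixedThreadPool(2);
        ExecutorService slowPool = Executors.newFixedThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy every thread of the slow pool until the latch is released.
            for (int i = 0; i < 2; i++) {
                slowPool.submit(() -> {
                    release.await();
                    return null;
                });
            }
            // The core business task runs on its own pool and is not queued behind slow work.
            Future<String> price = corePool.submit(() -> "price-ok");
            return price.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("core task did not complete", e);
        } finally {
            release.countDown();
            corePool.shutdown();
            slowPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fastResultWhileSlowPoolIsBlocked());
    }
}
```

Had both kinds of task shared one two-thread pool, the core task would have waited in the queue until the slow tasks finished.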

23. Java Tomcat
• start.sh

    export JAVA_OPTS="-Djava.library.path=/usr/local/lib -server \
      -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection \
      -XX:CMSInitiatingOccupancyFraction=80 -XX:+CMSParallelRemarkEnabled \
      -XX:SoftRefLRUPolicyMSPerMB=0 -XX:MaxDirectMemorySize=512m -Xss256k \
      -XX:NewRatio=1 -XX:SurvivorRatio=6 -Xms16384m -Xmx16384m -XX:MaxPermSize=256m \
      -Djava.awt.headless=true \
      -Dsun.net.client.defaultConnectTimeout=60000 -Dsun.net.client.defaultReadTimeout=60000 \
      -Djmagick.systemclassloader=no \
      -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.ttl=300 \
      -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$CATALINA_BASE/logs \
      -XX:ErrorFile=$CATALINA_BASE/logs/java_error_%p.log"

-XX:+UseConcMarkSweepGC: use the CMS collector
-XX:+CMSParallelRemarkEnabled: run the remark phase in parallel
-XX:+UseCMSCompactAtFullCollection: compact after a full GC, because CMS does not compact the heap by default
-XX:CMSInitiatingOccupancyFraction=80: set the threshold to 80%; the default is 68%
-XX:SoftRefLRUPolicyMSPerMB: softly reachable objects will remain alive for some amount of time after the last time they were referenced. The default value is one second of lifetime per free megabyte in the heap
-XX:NewRatio: ratio of the young generation (Eden plus the two Survivor spaces) to the old generation (the permanent generation is not included)
-XX:SurvivorRatio: size ratio of Eden to a Survivor space

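24. Java Tomcat
• server.xml

    <Connector port="1601" asyncTimeout="10000" acceptCount="10240"
               maxConnections="10240" acceptorThreadCount="1"
               minSpareThreads="5" maxThreads="5" redirectPort="8443"
               processorCache="1024" URIEncoding="UTF-8"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               enableLookups="false"/>

Taking Tomcat 6 as an example, the Connector has several key settings.
BIO implementation:
• acceptCount: the number of connections that may queue once the maximum number of connections is exceeded; beyond this value connections are refused outright; default 100
• maxThreads: the maximum number of threads tomcat may create; each thread handles one request, so this sets tomcat's upper thread limit; default 200
• minSpareThreads: the minimum number of spare threads, that is, the number of threads tomcat creates at startup; default 25 (configured when using an Executor)
• maxQueueSize: the maximum number of tasks that may queue for execution before being rejected; default Integer.MAX_VALUE (configured when using an Executor)
NIO implementation (inherits the settings above):
• acceptorThreadCount: the number of threads accepting connections; default 1; it can be tuned to the number of CPU cores, but the default of 1 is usually fine and rarely needs changing
• pollerThreadCount: the number of threads running selection events; default is one per core
• processorCache: the number of Http11NioProcessor objects cached by the protocol handler, to improve performance; default 200; a value close to maxThreads is recommended
For the corresponding tomcat7 settings see the official documentation at http://tomcat.apache.org/tomcat-7.0-doc/config/http.html; the core settings are much the same.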

25. Java servlet3

    <bean id="commonAsyncContext" class="com.jd.noah.base.web.DynamicAsyncContext">
        <property name="asyncTimeoutInMillis" value="20000"/>
        <property name="poolSize" value="32-1000"/>
        <property name="keepAliveTimeInMills" value="5000"/>
        <property name="queueCapacity" value="8192"/>
    </bean>

    @RequestMapping("/combination")
    public void getCombination(
            HttpServletRequest request,
            final Integer skuId,
            final Integer c1, final Integer c2, final Integer c3,
            final Integer lid, final Integer lim, final Integer p,
            final String pin, final String uuid) throws Exception {
        // hand the request over to the business thread pool
        commonAsyncContext.submit(request, new Callable<String>() {
            @Override
            public String call() throws Exception {
                return UMPs.monitor("get.combination.info", new UmpCallable<String>() {
                    @Override
                    public String call() throws Exception {
                        return combinationService.getCombinationInfo(
                                skuId, c1, c2, c3, lid, lim, p, pin, uuid);
                    }
                });
            }
        });
    }

26. Java thread pool
• Run tasks concurrently on a thread pool to fetch data

    <task:executor id="asyncTaskExecutor"
                   pool-size="${async.executor.pool.size}"
                   queue-capacity="${async.executor.queue.capacity}"
                   keep-alive="${async.executor.keep.alive.size}"/>

    // fetch the stock status
    inStockSkusFuture = asyncTaskExecutor.submit(new Callable<Map<Long, Boolean>>() {
        @Override
        public Map<Long, Boolean> call() throws Exception {
            return wareStockInfoSafService.getInStockStateMap(fittingIds);
        }
    });
    // fetch the product data (main product and recommended products)
    fittingSkusFuture = asyncTaskExecutor.submit(new Callable<Map<Long, Map<String, Object>>>() {
        @Override
        public Map<Long, Map<String, Object>> call() throws Exception {
            return getFittingProducts(mainSkuIdAndFittingIds,
                    Sets.newHashSet("name", "imagePath", "category", "state"));
        }
    });
    // wait for both results, at most 2 seconds each
    inStockSkuMap = inStockSkusFuture.get(2000, TimeUnit.MILLISECONDS);
    fittingsSkuMap = fittingSkusFuture.get(2000, TimeUnit.MILLISECONDS);

27. Java async
• Update the cache asynchronously

    asyncTaskExecutor.execute(new Runnable() {
        @Override
        public void run() {
            try {
                cache.set(combinationInfoKey, combinationInfoJsonFromApi,
                        getExpiresInMillis("combination.redis.expire.millis"));
            } catch (Exception e) {
                LOG.error("update combination to redis error: {}", combinationInfoKey, e);
            }
        }
    });

28. Java cache
• Local cache with Guava

    private LoadingCache<Integer, Map<String, String>> skuSortCache =
            CacheBuilder.newBuilder()
                    .softValues()
                    .maximumSize(10000)
                    .expireAfterAccess(5, TimeUnit.MINUTES)
                    .build(new CacheLoader<Integer, Map<String, String>>() {
                        @Override
                        public Map<String, String> load(Integer o) throws Exception {
                            return Constants.NULL_MAP;
                        }
                    });

29. Java cache
• For batch APIs, cache each item individually: first look up the per-item cache, then fetch the misses in one batch, and finally merge the two into the result

    public Map<Integer, Map<String, String>> getSkuSort(Set<Integer> sortIds) throws Exception {
        Map<Integer, Map<String, String>> hitSortMap = skuSortCache.getAll(sortIds);
        // collect the ids that were not found in the local cache
        Set<Integer> missSortIds = Sets.newHashSet();
        for (Integer sortId : sortIds) {
            if (MapUtils.isEmpty(hitSortMap.get(sortId))) {
                missSortIds.add(sortId);
            }
        }
        // fetch all misses in a single batch call and cache them
        Map<Integer, Map<String, String>> missSortMap = queryProductSort(missSortIds);
        skuSortCache.putAll(missSortMap);
        // merge hits and freshly loaded entries
        Map<Integer, Map<String, String>> resultSortMap = Maps.newHashMap();
        resultSortMap.putAll(hitSortMap);
        resultSortMap.putAll(missSortMap);
        return resultSortMap;
    }
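
The same lookup, batch-load, merge pattern can be written without Guava, which makes the three steps easier to see in isolation. The sketch below is a simplified stand-in, not the production code: `BatchCache` and its `batchLoader` function are hypothetical, there is no expiry or size limit, and the loader is assumed never to return null values.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Per-item cache in front of a batch API: only the misses reach the backend. */
class BatchCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<Set<K>, Map<K, V>> batchLoader;

    BatchCache(Function<Set<K>, Map<K, V>> batchLoader) {
        this.batchLoader = batchLoader;
    }

    Map<K, V> getAll(Set<K> keys) {
        Map<K, V> result = new HashMap<>();
        Set<K> misses = new LinkedHashSet<>();
        // Step 1: look up each key individually.
        for (K key : keys) {
            V hit = cache.get(key);
            if (hit != null) {
                result.put(key, hit);
            } else {
                misses.add(key);
            }
        }
        // Step 2: load all misses with one batch call, then cache them per item.
        if (!misses.isEmpty()) {
            Map<K, V> loaded = batchLoader.apply(misses);
            cache.putAll(loaded);
            // Step 3: merge hits and freshly loaded values.
            result.putAll(loaded);
        }
        return result;
    }
}
```

Caching per item rather than per batch means that requests for overlapping id sets still share cache entries, so the hit rate does not depend on which ids happen to be requested together.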

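30. Other
• Domain sharding: clients limit the number of connections per domain, so shard across domains: c.3.cn c1.3.cn c2.3.cn
• Make full use of the CPU, for example by binding to CPU cores
• Consider reducing the number of connections
• Consider using an in-memory file system
• Consider large memory or enterprise-grade SSDs
• Everything runs on the elastic cloud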

31. Thank you!