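This article explains how Flink SQL was introduced in Cloudera Streaming Analytics. It is shared here as a reference, and we hope it leaves you with a working understanding of the topic.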
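-- Count clicks per user per session; a session ends after 30 minutes of inactivity
SELECT
  userId,
  COUNT(*) AS click_count,
  SESSION_START(clicktime, INTERVAL '30' MINUTE) AS session_start
FROM clicks
GROUP BY
  SESSION(clicktime, INTERVAL '30' MINUTE),
  userId;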
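To make the semantics of the session window query above concrete, here is a minimal plain-Python sketch of what it computes: for each user, clicks that follow one another within 30 minutes belong to the same session, and each session reports its start time and click count. The function name `sessionize` and the input shape are hypothetical, and the behaviour when two clicks are exactly 30 minutes apart is an assumption of this sketch. It illustrates the grouping logic only, not how Flink executes the query.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize(clicks, gap=timedelta(minutes=30)):
    """Group (user_id, click_time) pairs into per-user sessions.

    A new session starts when the time since the user's previous click
    exceeds `gap` (boundary handling is an assumption of this sketch).
    Returns sorted (user_id, session_start, click_count) tuples.
    """
    by_user = defaultdict(list)
    for user_id, click_time in clicks:
        by_user[user_id].append(click_time)

    sessions = []
    for user_id, times in by_user.items():
        times.sort()
        start, last, count = times[0], times[0], 1
        for t in times[1:]:
            if t - last > gap:
                # Inactivity gap exceeded: close the current session.
                sessions.append((user_id, start, count))
                start, count = t, 0
            last = t
            count += 1
        sessions.append((user_id, start, count))
    return sorted(sessions)
```

Each returned tuple corresponds to one output row of the query: the user, the session start, and the number of clicks in that session.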
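1) How much of the business logic in the streaming domain can be expressed in SQL?
2) How does this change the journey of a streaming job from development to production?
3) How does this affect the scope of the data engineering team?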
CREATE TABLE ItemTransactions (
  transactionId BIGINT,
  `timestamp` BIGINT,
  itemId STRING,
  quantity INT,
  -- Event time derived from `timestamp`, which the division by 1000 suggests is in epoch milliseconds
  event_time AS CAST(from_unixtime(floor(`timestamp`/1000)) AS TIMESTAMP(3)),
  -- Watermark trails event time by 5 seconds to tolerate out-of-order events
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'transaction.log.1',
  'connector.startup-mode' = 'earliest-offset',
  'connector.properties.bootstrap.servers' = '<broker_address>',
  'format.type' = 'json'
);
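The computed `event_time` column and the `WATERMARK` clause in the table definition can be illustrated with a small plain-Python sketch. It assumes that `timestamp` holds epoch milliseconds and that times are interpreted in UTC (in Flink, `from_unixtime` depends on the configured time zone). The helper names `event_time` and `watermarks` are hypothetical, and the sketch only approximates how a watermark trails the largest event time seen so far.

```python
from datetime import datetime, timedelta, timezone


def event_time(timestamp_ms):
    """Mirror floor(`timestamp` / 1000): drop milliseconds, read as epoch seconds."""
    return datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)


def watermarks(timestamps_ms, delay=timedelta(seconds=5)):
    """Yield (event_time, watermark) for each record in arrival order.

    The watermark is the largest event time seen so far minus `delay`,
    so it never moves backwards when a late record arrives.
    """
    max_seen = None
    for ts in timestamps_ms:
        et = event_time(ts)
        if max_seen is None or et > max_seen:
            max_seen = et
        yield et, max_seen - delay
```

A record whose event time is older than the current watermark would be treated as late by window operators, which is why the 5-second delay bounds how much disorder the table tolerates.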
-- Preview a few rows from the table
SELECT * FROM ItemTransactions LIMIT 10;

-- Total quantity per item in 10-second tumbling windows
SELECT
  TUMBLE_START(event_time, INTERVAL '10' SECOND) AS window_start,
  itemId,
  SUM(quantity) AS volume
FROM ItemTransactions
GROUP BY itemId, TUMBLE(event_time, INTERVAL '10' SECOND);
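As a rough illustration of the tumbling window aggregation, the following plain-Python sketch assigns each transaction to a fixed, non-overlapping 10-second window and sums the quantity per item and window. The function name `tumble_sum` and the tuple layout are hypothetical, and event times are simplified to whole epoch seconds.

```python
from collections import defaultdict


def tumble_sum(transactions, window_seconds=10):
    """Sum quantity per (window_start, item_id).

    `transactions` is an iterable of (event_time_seconds, item_id, quantity).
    Each event belongs to exactly one window, whose start is the event
    time rounded down to a multiple of `window_seconds`.
    """
    volumes = defaultdict(int)
    for event_time_s, item_id, quantity in transactions:
        window_start = event_time_s - event_time_s % window_seconds
        volumes[(window_start, item_id)] += quantity
    return dict(volumes)
```

Unlike the streaming query, this sketch sees all input at once; in Flink a window's result is emitted once the watermark passes the end of that window.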
SELECT * FROM (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY window_start
      ORDER BY num_transactions DESC
    ) AS rownum
  FROM (
    SELECT
      TUMBLE_START(event_time, INTERVAL '10' MINUTE) AS window_start,
      itemId,
      COUNT(*) AS num_transactions
    FROM ItemTransactions
    GROUP BY itemId, TUMBLE(event_time, INTERVAL '10' MINUTE)
  )
)
WHERE rownum <= 3;
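The ranking step of the Top-N query can be sketched in plain Python as follows. The input rows stand in for the output of the inner windowed count, that is `(window_start, item_id, num_transactions)`. The function name `top_n_per_window` is hypothetical, and breaking ties by item id is an assumption of this sketch, since the query itself does not define an order among equal counts.

```python
from collections import defaultdict


def top_n_per_window(window_counts, n=3):
    """Keep the n items with the most transactions in each window.

    `window_counts` is an iterable of
    (window_start, item_id, num_transactions) rows.
    Returns (window_start, item_id, num_transactions, rownum) rows.
    """
    by_window = defaultdict(list)
    for window_start, item_id, num in window_counts:
        by_window[window_start].append((item_id, num))

    rows = []
    for window_start in sorted(by_window):
        # ORDER BY num_transactions DESC; ties broken by item id (assumption).
        ranked = sorted(by_window[window_start], key=lambda r: (-r[1], r[0]))
        # WHERE rownum <= n
        for rownum, (item_id, num) in enumerate(ranked[:n], start=1):
            rows.append((window_start, item_id, num, rownum))
    return rows
```

Partitioning by `window_start` is what restarts the row numbering for every 10-minute window, so each window contributes at most three rows.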