No relevant resource is found in the selected language.

This site uses cookies. By continuing to browse the site you are agreeing to our use of cookies. Read our privacy policy>Search

Reminder

To have a better experience, please upgrade your IE browser.

upgrade

FusionInsight HD 6.5.0 Product Description 02

Rate and give feedback:
Huawei uses machine translation combined with human proofreading to translate this document to different languages in order to help you better understand the content of this document. Note: Even the most advanced machine translation cannot match the quality of professional translators. Huawei shall not bear any responsibility for translation accuracy and it is recommended that you refer to the English document (a link for which has been provided).
Flink CEP in SQL

Flink CEP in SQL

Flink CEP in SQL

CloudStream is extended to allow users to express CEP queries for pattern matching in SQL and evaluate them on Flink engine for stream of events.

SQL Query Syntax

This is achieved via the MATCH_RECOGNIZE SQL syntax. MATCH_RECOGNIZE clause, supported by Oracle SQL starting from Oracle Database 12c, enables to express event pattern matching in SQL. Apache Calcite also supports MATCH_RECOGNIZE clause.

Since Flink parses SQL queries using Calcite, our implementation follows Apache Calcite syntax for CEP presented below.

MATCH_RECOGNIZE (
      [ PARTITION BY expression [, expression ]* ]
      [ ORDER BY orderItem [, orderItem ]* ]
      [ MEASURES measureColumn [, measureColumn ]* ]
      [ ONE ROW PER MATCH | ALL ROWS PER MATCH ]
      [ AFTER MATCH
            ( SKIP TO NEXT ROW
            | SKIP PAST LAST ROW
            | SKIP TO FIRST variable
            | SKIP TO LAST variable
            | SKIP TO variable )
      ]
      PATTERN ( pattern )
      [ WITHIN intervalLiteral ]
      [ SUBSET subsetItem [, subsetItem ]* ]
      DEFINE variable AS condition [, variable AS condition ]*
      )

The syntactic elements of the MATCH_RECOGNIZE clause are defined as follows:

- PARTITION BY [optional]: defines the partition column. This clause is optional. If not defined, the degree of parallelism 1 is used.

- ORDER BY [optional]: defines the order of events in the data stream. The ORDER BY clause is optional and non-deterministic ordering is used if omitted. However, since the order of events is important in pattern matching, this clause should be specified in most of the cases.

- MEASURES [optional]: specifies the computed attribute values of events that successfully matched.

- ONE ROW PER MATCH | ALL ROWS PER MATCH [optional]: defines how to output the results. ONE ROW PER MATCH indicates that the output has only one row for each match while ALL ROWS PER MATCH indicates that the output has one row for each event of each match.

- AFTER MATCH [optional]: specifies from where to start processing for the next pattern match after a successful match is found.

- PATTERN: defines the matching pattern as a regular expression format. The following operators can be used in the PATTERN clause: concatenation, quantifiers (*, +, ?, {n}, {n,}, {n,m}, {,m}), branching (using vertical bar `|`), and exclusion ( `{- -}` ).

- WITHIN [optional]: outputs a pattern clause match if and only if the match occurs within the specified time duration.

- SUBSET [optional]: groups together one or more correlation variables that are defined in the DEFINE clause.

- DEFINE: specifies a boolean condition that defines the variables used in PATTERN clause.

In addition, the following functions are supported:

- MATCH_NUMBER(): can be used in MEASURES clause to assign the same number to each row of the same successful match.

- CLASSIFIER(): can be used in MEASURES clause to indicate the mapping relationship between the rows in a match and variables.

- FIRST() and LAST(): can be used in MEASURES clause to return the value of an expression evaluated in the first/last row of the group of rows mapped to a pattern variable.

- NEXT() and PREV(): can be used in DEFINE clause to evaluate an expression using a previous/next row in a partition.

- RUNNING and FINAL keywords: can be used to determine the desired semantics for an aggregate. RUNNING can be used in MEASURES and DEFINE clauses, while FINAL can only be used in MEASURES clause.

- Aggregate function (COUNT, SUM, AVG, MAX, MIN): these aggregate functions can be used in MEASURES clause and DEFINE clause.

Query examples

The following query finds the V-shape patterns in stock prices data stream.

    SELECT *
    FROM MyTable
    MATCH_RECOGNIZE (
      ORDER BY rowtime  
      MEASURES
        STRT.name as s_name,
        LAST(DOWN.name) as down_name,
        LAST(UP.name) as up_name
      ONE ROW PER MATCH
      PATTERN (STRT DOWN+ UP+)
      DEFINE
        DOWN AS DOWN.v < PREV(DOWN.v),
        UP AS UP.v > PREV(UP.v) 
    ) 

In the following query the aggregate function AVG is applied in the MEASURES clause on the SUBSET E formed of A and C correlation variables.

    SELECT * 
    FROM Ticker 
    MATCH_RECOGNIZE ( 
      MEASURES
        AVG(E.price) AS avgPrice 
      ONE ROW PER MATCH 
      AFTER MATCH SKIP PAST LAST ROW 
      PATTERN (A B+ C) 
      SUBSET E = (A,C) 
      DEFINE 
        A AS A.price < 30, 
        B AS B.price < 20,
        C AS C.price < 30 
    )
Download
Updated: 2019-05-17

Document ID: EDOC1100074548

Views: 3941

Downloads: 37

Average rating:
This Document Applies to these Products
Related Documents
Related Version
Share
Previous Next