FusionInsight HD 6.5.0 Product Description 02

Configuration Tables


In some scenarios, users have fixed configuration tables that store basic information, and Flink needs to match incoming stream data against these tables. Because a configuration table can be large, Redis is recommended for storing it. Redis is a high-performance key-value database whose low query latency suits per-record lookups on stream data.

Storage process of configuration tables in Redis

Figure 4-9 Storage process of configuration tables in Redis

Overview of Redis

Redis is a data structure server: in addition to plain key-value storage, it supports several types of values. Redis supports the following data types:

  • Strings: Binary-safe strings.
  • Lists: Collections of string elements ordered by insertion. They are implemented as linked lists.
  • Sets: Unordered collections of unique string elements.
  • Sorted sets: Each string element is associated with a floating-point value called a score. Elements are kept sorted by score, so ranges of elements can be retrieved.
  • Hashes: Maps composed of fields associated with values. Both fields and values are strings.
  • Bit arrays: Strings can be handled as a series of bits by using dedicated commands. For example, you can set and clear individual bits, count the bits that are set to 1, and find the first bit that is set to 1 or 0.
  • HyperLogLogs: A probabilistic data structure that is used to estimate the cardinality of a set.

Redis clusters are used to store configuration tables holding up to 100 million records while still responding quickly to queries. Flink's asynchronous stream I/O is used to look up the table for each message, which improves data processing throughput.

NOTE:
  • Redis cluster: In a Redis cluster, Redis instances are deployed on all nodes of the cluster and data is distributed across the nodes, which provides a large storage capacity. FusionInsight provides a Redis component.
  • Asynchronous I/O: Asynchronous I/O is used to process data in a way that maximizes data processing throughput and efficiency.
Operations on Redis are as follows:
  1. Install Redis.

    When installing clusters, you can select the Redis component provided by FusionInsight.

  2. Import configuration tables to Redis.

    You can use the primary key or a combination of several key columns as the Redis key, depending on the characteristics of the configuration table. If the configuration tables to be stored contain a large number of attributes, you are advised to store them using the hash data type.

    The Redis component provided by FusionInsight includes the Jedis client, which can be used to insert and query data.

NOTE:

For details about Redis types, see the official website at https://redis.io/topics/data-types-intro.
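
The key design described in step 2 can be sketched as follows. This is a minimal illustration only: it uses a plain Python dictionary as a stand-in for Redis hashes rather than the Jedis client, and the table name `device_cfg`, the columns, and the `table:col1:col2` key layout are hypothetical choices, not something FusionInsight prescribes.

```python
from typing import Dict, Iterable, Mapping

# In-memory stand-in for Redis hashes: key -> {field: value}.
# A real deployment would issue HSET/HGETALL through a Redis client instead.
FakeRedis = Dict[str, Dict[str, str]]


def make_key(table: str, row: Mapping[str, str], key_columns: Iterable[str]) -> str:
    """Build one key from the table name and one or more key columns."""
    return ":".join([table, *(row[c] for c in key_columns)])


def import_table(store: FakeRedis, table: str,
                 rows: Iterable[Mapping[str, str]],
                 key_columns: Iterable[str]) -> None:
    """Store each row as one hash; the non-key attributes become hash fields."""
    key_columns = list(key_columns)
    for row in rows:
        fields = {k: v for k, v in row.items() if k not in key_columns}
        store.setdefault(make_key(table, row, key_columns), {}).update(fields)


# Hypothetical configuration table keyed by (region, device_id).
rows = [
    {"region": "eu", "device_id": "42", "owner": "alice", "tier": "gold"},
    {"region": "us", "device_id": "42", "owner": "bob", "tier": "basic"},
]
store: FakeRedis = {}
import_table(store, "device_cfg", rows, ["region", "device_id"])

# A stream record carrying the same key columns finds its configuration row.
record = {"region": "eu", "device_id": "42", "reading": "17.5"}
config = store[make_key("device_cfg", record, ["region", "device_id"])]
```

Storing a row as one hash keeps all of its attributes under a single key, so a stream record needs only one lookup to fetch them.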

Asynchronous I/O

When Flink interacts with an external system, such as an external database, waiting for each response can take a long time and reduces data processing efficiency. With asynchronous I/O, further requests can be sent without waiting for the responses to earlier requests, so the waiting times overlap.
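
The difference can be illustrated with a small sketch. It uses Python's standard `asyncio` module rather than Flink, and the `lookup` function with its 10 ms sleep is a hypothetical stand-in for a query to an external system. The recorded events show the order in which requests are sent and responses received.

```python
import asyncio
from typing import List


async def lookup(key: int, events: List[str]) -> str:
    """Hypothetical external lookup; the sleep stands in for network latency."""
    events.append(f"send {key}")
    await asyncio.sleep(0.01)
    events.append(f"recv {key}")
    return f"value-{key}"


async def synchronous_style(keys: List[int], events: List[str]) -> List[str]:
    # Each request waits for its response before the next one is sent.
    return [await lookup(k, events) for k in keys]


async def asynchronous_style(keys: List[int], events: List[str]) -> List[str]:
    # All requests are sent first; the waits for the responses overlap.
    return list(await asyncio.gather(*(lookup(k, events) for k in keys)))


sync_events: List[str] = []
async_events: List[str] = []
sync_results = asyncio.run(synchronous_style([1, 2, 3], sync_events))
async_results = asyncio.run(asynchronous_style([1, 2, 3], async_events))
```

In the synchronous style every `send` is immediately followed by its own `recv`. In the asynchronous style all three requests are sent before the first response arrives, and both styles return the results in input order.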

To use the asynchronous I/O API, the following are required:

  • Override the asyncInvoke method of AsyncFunction to implement the asynchronous data processing.
  • Register a callback that obtains the result of the asynchronous operation and passes it to AsyncCollector, which collects the results.
    Figure 4-10 Comparison of Sync. I/O and Async. I/O
  • Configure the timeout period and maximum capacity properly.

    The timeout period defines the maximum time allowed for one asynchronous request. The maximum capacity is the maximum number of asynchronous requests that can be in progress concurrently. You are advised to set the maximum capacity based on the characteristics of the data source: a value that is too large causes high resource consumption, and a value that is too small reduces throughput.
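
The effect of the two settings can be sketched outside Flink with Python's standard `asyncio` module. This is not the Flink AsyncFunction API: the names `TIMEOUT_S`, `CAPACITY`, `query`, and `bounded_query`, as well as the latency values, are assumptions chosen for illustration. A semaphore plays the role of the maximum capacity, and `asyncio.wait_for` enforces the timeout period.

```python
import asyncio
from typing import Dict, List, Optional

TIMEOUT_S = 0.2   # assumed timeout period for one request
CAPACITY = 2      # assumed maximum number of concurrent requests

# Hypothetical per-key latencies; the key "slow" exceeds the timeout.
LATENCY_S: Dict[str, float] = {"a": 0.01, "b": 0.01, "c": 0.01, "slow": 5.0}

in_flight = 0
peak_in_flight = 0


async def query(key: str) -> str:
    """Stand-in for a lookup in an external system."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    try:
        await asyncio.sleep(LATENCY_S[key])
        return key.upper()
    finally:
        in_flight -= 1


async def bounded_query(key: str, slots: asyncio.Semaphore) -> Optional[str]:
    async with slots:  # waits while CAPACITY requests are already in progress
        try:
            return await asyncio.wait_for(query(key), timeout=TIMEOUT_S)
        except asyncio.TimeoutError:
            return None  # timed out; the caller decides how to handle it


async def run(keys: List[str]) -> List[Optional[str]]:
    slots = asyncio.Semaphore(CAPACITY)
    return list(await asyncio.gather(*(bounded_query(k, slots) for k in keys)))


results = asyncio.run(run(["a", "b", "slow", "c"]))
```

With these assumed values, at most two requests are ever in progress at once, and the request for `slow` is abandoned after the timeout instead of holding up its slot for the full five seconds.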

Updated: 2019-05-17

Document ID: EDOC1100074548
