Common Terms and Definitions for the Universe Big Data Platform

liufc   Intermediate Member   Posted on 2016-10-17 11:00:36   Last reply: 2016-10-17 11:00:36

中文 (Simplified Chinese)   英文 (English)   缩略语 (Acronym/Abbreviation)   中文定义/描述 (Chinese Definition/Description)   英文定义/描述 (English Definition/Description)
应用数据层 Application data store ADS 华为Universe大数据平台的融合数据模型中的应用数据层。应用层分析模型作为Universe大数据系统最上层的模型,是为Universe大数据系统业务需求提供分析支撑的模型,是面向需求以及未来需求变更、扩展的模型。 Application data store of the Convergent Data Model (CDM) on Huawei Universe Analytics Platform. Data models at the Application Data Store (the top layer of the Universe) are oriented towards requirements, requirement changes, and requirement extensions, assisting service requirement analysis.
应用开发 Application Development   华为Universe大数据平台中统一分析开发平台的模块之一。提供专业的数据分析和展现引擎,用户可以通过图形化、拖拽式地操作方式,实现数据的统一建模和可视化设计。 A module of the Unified Analytics Openness Platform (UAOP). The application development provides professional data analysis and display engines. Users can perform unified data modeling and visualized design by dragging controls or diagram elements.
开放平台统一门户 API Service Open Portal   华为Universe大数据平台中统一分析开放平台的模块之一。开放平台统一门户用于集中展现平台开放能力,为运营商的内外部用户提供统一的访问入口。 A module of the UAOP. The API service open portal uniformly displays capabilities opened by the platform and provides unified access for internal and external users of carriers.
API服务开放 API Service Openness   华为Universe大数据平台中统一分析开放平台的模块之一。通过向内外部系统提供API服务,实现数据资产和平台能力的开放,使第三方ISV利用统一分析开放平台提供API能构建百花齐放的应用,最大限度的发挥运营商大数据平台的价值。同时促进运营商构建开放式数字企业架构,向外提供服务功能以融入数字化世界,加速转型。 A module of the UAOP. API service openness provides API services for internal and external systems and opens up data assets and platform capabilities. Third-party ISVs can use the APIs provided by the UAOP to build a wide variety of applications, maximizing the value of carriers' big data platform. It also helps carriers construct an open digital enterprise architecture, provide services externally to join the digital world, and accelerate transformation.
API服务网关 API Service Gateway   华为Universe大数据平台中统一分析开放平台的模块之一。API服务网关模块主要提供API调用接入、API流控、鉴权认证、API计量计费、API路由、协议转换、格式转换、日志记录和监控告警的功能。 A module of the UAOP. The API service gateway module mainly provides API invocation and access, API traffic control, authentication and authorization, API metering and charging, API routing, protocol conversion, format conversion, log recording, and monitoring and alarming.
API服务执行 API Service Execution   华为Universe大数据平台中统一分析开放平台的模块之一。API服务执行模块用于提供数据获取和安全控制功能。 A module of the UAOP. The API service execution module provides data obtaining and security control functions.
基础标签 Basic Customer Telco Tags   分析传统电信业务方面的信息和客户的基本信息、行为及特征,形成客户的基础标签。 Information about traditional telecom services and customers' basic information, behavior, and characteristics are analyzed to generate basic customer telecom tags.
批量决策 Batch Decision   提供灵活的执行周期,支持周期营销的批量决策。根据周期营销的执行计划,获取执行周期内最新客户信息进行营销决策。 Provides flexible execution periods, supports batch decision making for periodic marketing, and can obtain the latest customer information in an execution period for decision making based on the execution plan of periodic marketing.
批量数据处理 Batch Processing   Universe大数据分析平台数据集成模块基于B/S架构,利用分布式计算框架Map/Reduce和Spark实现海量数据处理。
批量数据处理支持海量数据的采集、关联、汇总计算;支持从普通文件系统或HDFS中抽取文本或XML文件,或将文本或XML文件加载到本地文件系统或HDFS中。
The Data Integration module of the Universe Analytics Platform uses the browser/server (B/S) architecture and processes massive data using the distributed computing frameworks Map/Reduce and Spark.
This feature supports ingestion, association, and aggregation of massive data. Text or XML files can be extracted from common file systems or the HDFS, or loaded to a local file system or the HDFS.
基础数据层 Basic data store BDS 华为Universe大数据平台融合数据模型的基础数据层。基础数据层模型是对电信系统业务及其之间关联关系的数据描述。基础数据层模型是数据仓库建设的基础,一个统一、完整、灵活、稳定的基础数据层模型对大数据仓库项目的成功起着重要作用。 Basic data store of the CDM on Huawei Universe Analytics Platform. Data models at the Basic Data Store describe telecom services and the relationships between them, and are the basis of data warehouse construction. A unified, complete, flexible, and stable Basic Data Store model plays an important role in the success of a big data warehouse project.
运营实体 Business entity BE 客户签约的运营主体,是客户的所有者。通常是携带市场运营策略的主体,且具备上下级层级关系。 An operation entity that signs contracts with customers. Generally, a business entity (BE) implements marketing strategies and is a subordinate or superior BE of another BE.
BookKeeper BookKeeper   BookKeeper是一个可靠地记录日志流的系统。 其作用是记录Write Ahead Log(操作具体数据结构之前先记录日志)。 A reliable system for recording log streams. It functions in the Write Ahead Log mode (generates record logs before processing data structures). 
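The write-ahead idea in the BookKeeper entry can be illustrated with a minimal sketch. This is plain Python, not the BookKeeper API; the class and method names are hypothetical, and an in-memory list stands in for a durable log. Each change is appended to the log before the data structure is modified, so the state can be rebuilt by replaying the log.

```python
class WriteAheadStore:
    """Toy key-value store that logs every change before applying it (hypothetical example)."""

    def __init__(self):
        self.log = []   # append-only record of operations (stands in for a durable log)
        self.data = {}  # the actual data structure

    def put(self, key, value):
        self.log.append(("put", key, value))  # 1. record the operation first
        self.data[key] = value                # 2. then modify the data structure

    @classmethod
    def recover(cls, log):
        """Rebuild a store by replaying a previously written log in order."""
        store = cls()
        for op, key, value in log:
            if op == "put":
                store.put(key, value)
        return store


store = WriteAheadStore()
store.put("a", 1)
store.put("a", 2)
recovered = WriteAheadStore.recover(store.log)
```

Replaying the log in order reproduces the latest value for each key, which is why the log entry has to be written before the data structure is touched.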
标签 Tag   客户标签定义了一组可以复用的客户分群规则。例如,使用“零次用户”标签表示最近30天零次呼出呼入的用户,这些用户被认为是存在高流失可能性的客户。 Customer tags define a group of reusable customer segmentation rules. For example, the zero-usage tag identifies users who made or received no calls in the last 30 days. These users are considered likely to churn.
客户情景感知 Contextual Awareness Engine CAE Universe Digital Marketing的实时决策的内部功能模块。客户情景感知是基于上下文实时还原情景信息,深度洞察客户位置、互联网行为特征的实时处理引擎,构建基于上下文或关键时刻感知的实时决策系统。 A functional module of the Real-Time Decision in the Universe Digital Marketing. The Contextual Awareness Engine (CAE) is a real-time processing engine that restores scenario information in real time based on the context and gains in-depth insight into features such as customers' locations and Internet behavior. It is used to construct a real-time decision-making system based on context or key-moment awareness.
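A tag as a reusable segmentation rule could be sketched as follows. This is a hypothetical illustration, not the platform's tag engine: the function name, the data layout, and the sample customers are all assumptions.

```python
from datetime import date, timedelta


def is_zero_usage(call_dates, today, window_days=30):
    """Return True if no call was made or received in the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    return not any(d > cutoff for d in call_dates)


# Hypothetical call records: customer -> dates of calls made or received
customers = {
    "alice": [date(2016, 10, 1)],  # called recently
    "bob": [date(2016, 8, 15)],    # last call about two months ago
    "carol": [],                   # no call records at all
}
today = date(2016, 10, 17)

# The same rule can be reused to build a customer segment
zero_usage_segment = sorted(
    name for name, calls in customers.items() if is_zero_usage(calls, today)
)
```

Here `zero_usage_segment` contains `bob` and `carol`, the customers the rule flags as likely to churn.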
营销管理 Campaign Management CampaignMgmt Universe数字化营销的模块。营销管理子系统帮助运营商构建一个智慧的端到端业务营销平台,通过与客户洞察子系统集成对接,将客户、产品、交易等数据进行有效整合和挖掘,在充分理解客户和市场的基础上,构建一个能支撑运营商业务营销的端到端整合系统。 A module of the Universe Digital Marketing. The Campaign Management subsystem integrates with and connects to the Customer Insight subsystem and efficiently integrates and mines customer, product, and transaction data, helping carriers construct an intelligent end-to-end service marketing platform.
复杂事件处理 Complex event processing CEP 基于事件驱动架构(EDA)的一门技术,它根据预定义或订阅的规则来发现可能本来不被注意的模式和趋势,从而使用户有能力建模、确认、预计及迅速响应看似无关的事件所带来的机遇和威胁。主要应用:算法交易,定量投资,风险管理,商业活动监控、群众智能、网络攻击、犯罪预防、系统动态校验等。 A technology based on the event-driven architecture (EDA). Based on predefined or subscribed rules, it discovers patterns and trends that might otherwise go unnoticed, enabling users to model, identify, anticipate, and quickly respond to the opportunities and threats posed by seemingly unrelated events. Main application scenarios include algorithmic trading, quantitative investment, risk management, business activity monitoring, crowd intelligence, network attack prevention, crime prevention, and dynamic system validation.
融合信息层 Convergent information store CIS 华为Universe大数据平台融合数据模型中的融合信息层。融合信息层是面向业务领域的设计,每个业务领域的核心,我们称之为信息主体,信息主体及其主体活动共同构成了信息内容。 Convergent information store of the CDM on Huawei Universe Analytics Platform. The Convergent Information Store is designed around service domains. The core of each service domain is called an information subject. Information subjects and their activities together constitute the information content.
客户知识管理 Customer Knowledge Management CKM 客户知识管理模块用于帮助运营商实现统一的客户信息管理和共享,并与运营商营销管理系统有效集成,最终帮助运营商实现精准业务营销和智慧运营。 Helps carriers to manage and share customer information in a unified manner and implements precision marketing and intelligent business operation by integration with the Campaign system provided by carriers. 
知识库 Corpus Repository   客户洞察套件内部模块。知识库,是知识工程中结构化、易操作、易利用、全面有组织的知识集群,是针对某一(或某些)领域问题求解的需要,采用某种(或若干)知识表示方式在计算机存储器中存储、组织、管理和使用的互相联系的知识片集合。 A structured, easy-to-operate, easy-to-use, and comprehensive knowledge cluster, and an interconnected knowledge set stored, organized, managed, and used on the computer using one or multiple knowledge representation methods aiming to solve problems of a specific domain.
Cube Cube   Cube是华为自主研发的一个内存数据库系统,用于用户数据分析或者用户数据高并发查询。其作用是加载客户统一视图到缓存,在进行客户分群、活动设计、活动执行时,可以从该服务器快速获取客户信息。 An in-memory database system developed by Huawei for user data analysis or highly-concurrent query. The Cube loads unified customer views to a cache and customer information can be quickly obtained from the Cube during customer segmentation, activity design, and activity execution. 
行为习惯 Customer Behavior Tags   分析用户的位置移动特征、业务特征、消费特征,通信特征等。通过了解客户的行为模式、购买习惯等行为习惯,运营商可以有针对性地刺激客户使用业务,为客户提供专属个性化优惠资源,提升营销效率和效果。 Customers' location change features, service features, consumption features, and communication features are analyzed to understand customers' behavior patterns and purchase habits. Carriers can stimulate customers' service usage and provide personalized preferential resources for customers, improving marketing efficiency and effect.
客户洞察 Customer Insight   客户洞察子系统提供客户超细分分群能力,实时感知客户行为和属性变化,结合预测分析,实现对客户营销时机、行为偏好等全方位的洞察。 Provides the customer segmentation capability, perceives customer behavior and attribute change in real time, and achieves all-round insights into customer marketing moment and preference with forecast analysis. 
客户知识发现 Customer Knowledge Discovery   客户洞察的子模块之一。客户知识发现是个逻辑概念,用于从融合数据中发现客户知识,支撑客户特征数据的整合与分析。客户知识发现模块基于各种领域知识对用户行为进行标注,然后利用分析模型计算标签,计算完的标签结果同步到Cube中。 A sub-module of the Customer Insight subsystem. The Customer Knowledge Discovery module discovers customer knowledge from convergent data to support customer characteristic data integration and analysis. The Customer Knowledge Discovery module marks user behavior based on domain knowledge, calculates tags by analysis models, and synchronizes tag calculation results to the Cube. 
产品业务 Customer Product and Service Tags   分析客户的产品、渠道和内容的特征。通过了解客户的产品持有情况、业务使用情况和产品偏好等,可以对其进行个性化营销,提升客户的产品购买力和业务使用。 Customers' product, channel, and content characteristics are analyzed to enable carriers to carry out marketing activities based on customers' product information, service usage, and product preference, and increase customers' product purchase and service usage.
人口轮廓 Customer Profile   分析客户的社会特征。通过获取客户的个人偏好、家庭背景和职业信息,更好地匹配客户特征和运营产品,精确的指导客户购买产品和使用业务。 Customers' social characteristics are analyzed to obtain customers' preference, family background, and occupation information, facilitating better matching between customer characteristics and products and precisely instructing customers to purchase products and use services.
客户群洞察 Customer Segment Insight   包含OOTB基于场景的客户分群和灵活的客户分群。客户分群是进行准确市场细分和差异化营销策略制定的基础和前提。 Includes out-of-the-box (OOTB) scenario-based customer segmentation and flexible customer segmentation. Customer segmentation is the prerequisite for segmenting the market accurately and formulating differentiated marketing strategies. 
客户分群 Customer Segmentation   客户分群是进行准确市场细分和差异化营销策略制定的基础和前提。客户洞察特性支持通过文件导入和规则创建两种方式进行客户分群。 The prerequisite for segmenting the market accurately and formulating differentiated marketing strategies. Customer segments can be created by importing a file or by creating rules.
标签服务 Customer Tag APIs   客户洞察对外提供API服务接口,支持标签的接入、查询和权限管理。 The Customer Insight subsystem provides APIs for tag access, query, and permission management for external systems. 
价值贡献 Customer Value Tags   分析客户历史给运营商的各项贡献,并预测用户未来的增长潜力。帮助运营商根据客户带来的价值提供差异化服务,做到聚焦高价值客户,同时提升长期客户,以节约成本。 Customers' historical contributions to carriers are analyzed to forecast customers' growth potential and help carriers provide differentiated services based on customer value, focus on high-value customers, improve the value of long-term customers, and reduce cost.
数据探索 Interactive Self Analysis Explorer ISAE 数据探索是Universe基础套件之一,面向角色为运营商的业务人员,用于解决日常业务事件驱动的自助式数据探索。主要业务能力包括数据提取、探索分析、应用发布。数据提取基于统一的元数据提供业务人员自助查询模型,自助设计取数的功能。探索分析用于支撑可视化探索、预测分析、仪表板组装,提供多种展现图表和分析方法,提供各类预测分析能力。每个探索分析都可以发布为应用,应用管理提供共享设置、数据集刷新调度等功能。 The ISAE is a basic module of the Universe Analytics Platform, which provides service event-driven self-service data discovery for carriers' service personnel. The main capabilities of the ISAE include data extraction, exploratory analysis, and application release. Data extraction lets service personnel query models and design data retrieval on their own, based on unified metadata. Exploratory analysis supports visualized exploration, predictive analytics, and dashboard assembly, and provides multiple chart types, analysis methods, and predictive analytics capabilities. Each exploratory analysis can be released as an application. Application management provides functions such as sharing settings and data set refresh scheduling. 
数据加解密 Data Encryption/Decryption   数据加密是实现数据脱敏的常用方法。被授权访问明文的用户,可通过数据解密服务获得数据的明文信息。
华为Universe大数据分析平台的统一数据治理平台对外提供了数据加解密服务。
This feature provides efficient data encryption and decryption algorithms, which improves the data security and privacy protection efficiency.
The Unified Data Governance Platform (UDGP) of Huawei Universe Analytics Platform provides data encryption and decryption services for external systems. 
数据集成 Data Integration BDI 统一数据治理平台“数据集成”模块主要用于批量数据处理,利用分布式计算框架Map/Reduce或Spark引擎,实现对海量数据的抽取、转换和加载。 The Data Integration module of the UDGP is used to process data in batches and extract, transform, and load large amounts of data using the distributed computing framework, Map/Reduce or Spark engine.
数据获取 Data Obtaining   统一分析开发平台的“数据获取”模块,帮助业务运营人员摆脱对IT支撑部门的依赖,可通过图形化界面自助获取需要的数据。 The Data Obtaining module on the UADP enables service operation personnel to obtain required data using a graphical page without the help of IT support personnel. 
数据可视 Data Visualization   Universe大数据分析平台提供专业的数据分析和展现引擎,用户可以通过图形化、拖拽式地操作方式,实现数据的统一建模和可视化设计。 The Universe Analytics Platform provides professional data analysis and display engines. Users can perform data modeling and visualized design by dragging controls or diagram elements.
DataNode DataNode   DataNode是Hadoop文件存储的基本单元,它将Block存储在本地文件系统中,保存了Block的Metadata,同时周期性地将所有存在的Block信息发送给NameNode。 Basic Hadoop file storage unit. A DataNode stores blocks and block metadata in the local file system, and periodically sends all block information to the NameNode. 
分布式数据库 Distributed database DDB 分布式数据库系统通常使用较小的计算机系统,每台计算机可单独放在一个地方,每台计算机中都可能有DBMS的一份完整拷贝副本,或者部分拷贝副本,并具有自己局部的数据库,位于不同地点的许多计算机通过网络互相连接,共同组成一个完整的、全局的逻辑上集中、物理上分布的大型数据库。 The distributed database system generally uses small computer systems. Each computer can be placed in a separate location, may contain a full or partial copy of the Database Management System (DBMS), and has its own local database. Computers at different locations connect to each other through a network and together constitute a complete, global, large-scale database that is logically centralized and physically distributed. 
数据治理 Data Governance DG 借鉴资产管理的方法理论来管理数据,将数据作为一种特殊的资产,对进入平台的数据进行标准化的规范约束,并以元数据作为驱动,连接数据的标准管理、数据质量管理、数据安全管理的各个阶段,形成统一、完善的数据治理体系,以解决实际业务问题为导向,增强数据治理子系统对业务发展的支撑能力。 Asset management theories are referenced for data management. Considered as special assets, data to be stored on the platform must comply with standards. In addition, metadata is used as a driver that connects data standard management, data quality management, and data security management to form a unified and complete data governance system. This system is oriented towards solving actual service issues and enhances the capability of the Data Governance subsystem to support service development. 
数据管理平台 Data Management Platform DMP 数据管理平台,是把分散的第一、第三方数据进行整合纳入统一的技术平台,并对这些数据进行标准化和细分,让用户可以把这些细分结果推向现有的互动营销环境里。 The Data Management Platform integrates scattered first-party and third-party data into a unified technology platform, standardizes and segments the data, and enables users to push the segmentation results to existing interactive marketing environments. 
登录日志 Login log   是员工帐号登录或登出系统后,系统自动记录的员工帐号登录或登出的时间、地址等信息。 When an employee account logs in to or out of the system, the system automatically records information such as the login or logout time and address. 
定量数据分析 Quantitative data analysis   定量数据分析是依据统计数据,建立数学模型,并用数学模型计算出分析对象的各项指标及其数值的一种方法。 A method that builds mathematical models based on statistical data and uses the models to calculate the indicators of the analyzed object and their values. 
电子渠道覆盖 eChannels   针对运营商自有线上渠道做营销推送及交互策略。 Marketing messages and interaction policies can be transferred through carrier-operated online channels. 
E-R建模 Entity-relationship modeling   E-R建模是一种自顶向下的数据建模方法,先区分所有重要的实体,确定实体属性,然后找出它们之间的关系,最终结果是建立一个实体关系模型。 A top-down data modeling method. It identifies all key entities, determines entity attributes, finds out their relationships, and finally establishes an entity-relationship model.
探索分析 Exploratory analysis   实现对数据的可视化探索分析,在探索分析过程中提供联想式、向导式的智能化分析能力。 Supports visualized exploratory data analysis and provides association-based and wizard-based intelligent analysis capabilities during exploration. 
Flume Flume   Flume是一个高可用的,高可靠的,分布式的海量日志采集、聚合和传输的系统,Flume支持在日志系统中定制各类数据发送方,用于收集数据;同时,Flume提供对数据进行简单处理,并写到各种数据接受方(可定制)的能力。 A highly-available, highly-reliable, and distributed system that collects, aggregates, and transmits large volumes of logs. The Flume supports customization of various data senders in the log system to collect data. In addition, the Flume can perform simple data processing and write data to various receivers (customizable).
分类算法 Classification algorithm   单一的分类方法主要包括:决策树、贝叶斯、人工神经网络、K-近邻、支持向量机和基于关联规则的分类等;另外还有用于组合单一分类方法的集成学习算法,如Bagging和Boosting等。 Individual classification methods mainly include decision trees, Bayesian classifiers, artificial neural networks, k-nearest neighbors, support vector machines, and classification based on association rules. There are also ensemble learning algorithms that combine individual classification methods, such as Bagging and Boosting. 
地理标记语言 Geography markup language GML GML是可扩展标记语言(标准通用标记语言的子集)在地理空间信息领域的应用。利用GML可以存储和发布各种特征的地理信息,并控制地理信息在Web浏览器中的显示。 An application of the Extensible Markup Language (XML, a subset of the Standard Generalized Markup Language) in the geospatial information domain. GML can store and publish geographic information of various features and control how geographic information is displayed in a web browser. 
Hama Hama   是一个基于HDFS的BSP(Bulk Synchronous Parallel)并行计算框架,Hama可用于包括图、矩阵和网络算法在内的大规模、大数据计算。 A bulk synchronous parallel (BSP) computing framework based on the HDFS. The Hama can be used for large-scale big data computing including graph, matrix, and network algorithms. 
HCatalog HCatalog   HCatalog是建立在Hive元数据之上的一个表信息管理层,吸收了Hive的DDL命令。为MapReduce提供读写接口,提供Hive命令行接口来进行数据定义和元数据查询。 A table information management layer based on the Hive metadata. The HCatalog absorbs the DDL commands of the Hive and provides read and write interfaces for the MapReduce and Hive command line interface (CLI) for defining data and querying metadata. 
Hmaster Hmaster   HBase的管理进程,作用如下:
1、为Region server分配region
2、负责Region server的负载均衡
3、发现失效的Region server并重新分配其上的region
4、HDFS上的垃圾文件回收
5、处理schema更新请求
Management process of the HBase with the following functions:
1. Assign regions to Region servers.
2. Balance load across Region servers.
3. Detect failed Region servers and reassign their regions.
4. Reclaim junk files in the HDFS.
5. Handle schema update requests. 
HQL Hive query language HQL Hive数据仓库的标准数据查询语言。 Standard data query language of the Hive data warehouse. 
HRegionServer HRegionServer   HBase实际负责数据存储的进程 Process that implements data storage in the HBase. 
Impala Impala   对存储在HDFS、HBase的数据提供直接查询互动的SQL。 Provides interactive SQL for directly querying data stored in the HDFS and HBase. 
互联网行为感知 Internet behavior awareness   为运营商提供了敏捷感知用户上网行为的能力,配合客户洞察组件的知识库模块,能够识别用户音乐类、游戏类、新闻类、综合门户类、阅读类、视频类的行为感知,基于用户互联网使用对象和行为触发动态营销策略,为用户提供及时精准的营销服务。 Enables carriers to agilely perceive users' Internet access behavior. Working with the Corpus Repository module of the Customer Insight component, it recognizes music, game, news, portal, reading, and video behavior, and triggers dynamic marketing policies based on the objects users access on the Internet and their behavior, providing timely and precise marketing services.
交互式自助分析 Interactive self-analysis ISA ISA是一个专业的数据分析和展现平台,提供了静态报表、仪表盘、OLAP分析、自助取数、指标库、自助分析等整套分析解决方案。 A professional data analysis and display platform that provides functions such as the static report, dashboard, OLAP, self-service data obtaining, KPI library, and self-service analysis.
Intelligent Search Intelligent Search ISearch 互联网爬虫模块。Intelligent Search产品为解决方案提供融合系统内外各种数据源以及聚合互联网海量数据源的搜索能力。 Internet crawler. It provides the capability of searching various data sources inside and outside the Universe Analytics Platform and massive Internet data sources.
JobHistoryServer JobHistoryServer   当job执行完成后,其运行信息会存储至JobHistoryServer当中。用户可通过JobHistoryServer获知job的运行是否成功。 After a job is executed, the job execution information is saved in the JobHistoryServer. Users can check whether a job is successfully executed using the JobHistoryServer. 
Kafka Kafka   Kafka是一个分布式的、分区的、多副本的消息发布-订阅系统,它提供了类似于JMS的特性,但在设计上完全不同,它具有消息持久化、高吞吐、分布式、多客户端支持、实时等特性,适用于离线和在线的消息消费,如常规的消息收集、网站活性跟踪、聚合统计系统运营数据(监控数据)、日志收集等大量数据的互联网服务的数据收集场景。 A distributed, partitioned, and replicated message publish-subscribe system. It provides features similar to the Java Message Service (JMS), but its design is completely different. The Kafka provides features such as message persistence, high throughput, distribution, multi-client support, and real-time processing, and applies to online and offline message consumption. The Kafka applies to massive data collection scenarios of Internet services, such as conventional message collection, website activity tracking, aggregation of operation data in statistics systems (monitoring data), and log collection. 
Kerberos Kerberos   Kerberos 是一种网络认证协议,其设计目标是通过密钥系统为客户机 / 服务器应用程序提供强大的认证服务。在Hadoop里面用于支撑多租户的实现。 A network authentication protocol. It provides security authentication for access between the client and the server through a key system. It enables multi-tenant access in the Hadoop. 
键值数据库 Key-value database   键值数据库是一种NoSQL(非关系型数据库)模型,其数据按照键值对的形式进行组织、索引和存储。 KV存储非常适合不涉及过多数据关系业务关系的业务数据,同时能有效减少读写磁盘的次数,比SQL数据库存储拥有更好的读写性能。 A Not Only SQL (NoSQL) database model in which data is organized, indexed, and stored as key-value pairs. The key-value database is ideal for service data that involves limited data or service relationships. It effectively reduces the number of disk read and write operations and has better read and write performance than SQL databases. 
客户画像 Customer Profile   客户画像是客户信息化的标签,是包含客户属性及属性值的统一视图。 Customer information tag that provides a unified view of customer attributes and attribute values.
客户群 Customer segment   客户群是指具有相同行为或特征的客户群体。 A group of customers with the same behavior or characteristics. 
控制流 Control flow   用于对任务的编排,用于控制任务之间的执行流程。控制流各个任务之间没有数据流向。 Arranges tasks and controls the task execution flow. Data is not transmitted between tasks in a control flow.
数据流 Data flow   用于对数据集的处理流程,主要负责对数据集进行抽取、转换和加载。 Defines the processing flow of data sets and is mainly responsible for extracting, transforming, and loading data sets.
位置变更行为感知 Location change awareness   为运营商提供了敏捷感知用户位置变更轨迹的能力,能够及时识别用户进入、离开或逗留某区域,基于位置关联和位置人群关联触发动态营销策略,为用户提供及时精准的营销服务。 Enables carriers to be agilely aware of users' location change trajectory, recognizes the events that users enter, stay at, or leave a region, and then triggers dynamic marketing policies based on the location associations, and the association between locations and users, providing instant and precise marketing services.
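Recognizing that a user enters, stays in, or leaves a region can be sketched as a simple state machine over an ordered location track. This is an illustrative assumption, not the product's algorithm; the function name and the cell identifiers are hypothetical.

```python
def region_events(locations, region):
    """Return (index, event) pairs for an ordered location track.

    event is 'enter', 'stay', or 'leave' relative to `region`,
    a set of location identifiers (for example, cell IDs).
    """
    events = []
    inside = False
    for i, loc in enumerate(locations):
        now_inside = loc in region
        if now_inside and not inside:
            events.append((i, "enter"))
        elif now_inside and inside:
            events.append((i, "stay"))
        elif inside and not now_inside:
            events.append((i, "leave"))
        inside = now_inside
    return events


mall_cells = {"cell_7", "cell_8"}                 # hypothetical region
track = ["cell_1", "cell_7", "cell_8", "cell_3"]  # hypothetical user track
events = region_events(track, mall_cells)
```

Each emitted event could then be matched against marketing policies, for example sending an offer on `enter`.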
流计算 Spark Streaming   实时分析处理运动中的数据。通过描述和预测分析支持实时决策。流计算可捕捉和分析所有的数据并不受时间影响。 Analyzes and processes data in motion in real time and supports real-time decision making through descriptive and predictive analysis. Stream computing can capture and analyze all data without being affected by time. 
Mahout Mahout   是基于Hadoop的机器学习和数据挖掘的一个分布式框架。Mahout用MapReduce实现了部分数据挖掘算法,解决了并行挖掘的问题。 A Hadoop-based distributed framework for machine learning and data mining. The Mahout implements some data mining algorithms on the MapReduce, which addresses the problem of parallel mining. 
平均差 mean deviation   单项测定值与平均值的偏差之和,除以测定次数。 Sum of the absolute deviations of the individual measured values from the mean, divided by the number of measurements.
众数 mode   一组数据中出现次数最多的数值。 Value with the highest occurrence frequency in a data set. 
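The two statistics above can be computed as in the following minimal sketch. It assumes the deviations in the mean deviation are taken as absolute values (otherwise they would sum to zero); the function names and sample data are illustrative.

```python
from collections import Counter


def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def mode(values):
    """Most frequent value; ties are resolved by first occurrence."""
    return Counter(values).most_common(1)[0][0]


samples = [2, 4, 4, 5, 10]
md = mean_deviation(samples)  # mean is 5; |deviations| = 3, 1, 1, 0, 5
mo = mode(samples)            # 4 occurs twice, every other value once
```

For these samples the mean deviation is 10 / 5 = 2.0 and the mode is 4.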
MongoDB MongoDB   一种开源的非关系型数据库(NoSQL database)多维数据库(Multi-Dimensional Databases)——用于优化数据联机分析处理(OLAP)程序,优化数据仓库的一种数据库 An open-source NoSQL and multidimensional database for optimizing data online analytical processing (OLAP) programs and the data warehouse. 
智能接触控制 Multi-channel Engagement   数字化营销的模块之一。提供客户接触全覆盖能力,支持运营商内部渠道和互联网社交渠道统一智能接触控制,有效提升客户接触频次和效率同时保障客户体验。 A module of the Universe Digital Marketing. The Multi-channel Engagement subsystem provides the customer contact omni-coverage capability and performs unified intelligent contact control for carriers' internal channels and Internet social channels, effectively improving the customer contact frequency and efficiency, and ensuring customer experience.
多维分析 Multidimensional Online Analytical Processing MOLAP 多维分析是借助于OLAP分析方法,通过将多个维度与量度的任意交叉组合,同时运用钻取、旋转、切片等操作,帮助用户快速查找出问题原因。 Uses OLAP analysis methods to analyze data by arbitrarily combining multiple dimensions and measures, with operations such as drill-down, pivoting, and slicing, helping users quickly find the causes of problems.
模型管理器 Model manager   分析模型的运行态管理容器。模型管理 (Model Manager):分析模型的版本化管理,提升模型精准性。 Runtime management container for analysis models. It provides version-based management of analysis models and improves model precision. 
NameNode NameNode   Hadoop分布式文件系统的主进程,运行在集群的主节点上。 NameNode维护文件系统树和树中所有文件和目录的元数据,并存储数据块与DataNode的对应关系。HDFS有两个核心NameNode(一个主节点),DataNode(多个从节点),DataNode主要是存储数据的,NameNode一是管理文件系统文件的元数据信息(包括文件名称、大小、位置、属性、创建时间、修改时间等等),二是维护文件到块的对应关系和块到节点的对应关系,三是维护用户对文件的操作信息(文件的增删改查),负责任务调度。 Main process of the Hadoop Distributed File System (HDFS) that runs on the master node of the cluster. The NameNode maintains the file system tree and the metadata of all files and directories in the tree, and stores the mapping between data blocks and DataNodes. The HDFS consists of the NameNode (one master node) and DataNodes (multiple slave nodes). The DataNodes store data. The NameNode manages the metadata of file system files (including the file name, size, location, attribute, creation time, and modification time), maintains mapping between files and data blocks and between blocks and nodes, maintains information about users' operations on files (such as adding, deleting, modifying, and querying files), and implements task scheduling. 
网络服务质量 Network quality of service Network QoS 网络服务质量,通信系统或信道的常用性能指标之一。不同的系统及业务中其定义不尽相同,可能包括抖动、时延、丢包率、误码率、信噪比等。用来衡量一个传输系统的传输质量和服务有效性,评估服务商满足客户需求的能力。 A common performance KPI of a telecommunication system or channel. The definition of the network quality of service varies from system to system, and it may include the jitter, delay, packet loss ratio, bit error ratio, and signal-to-noise ratio. The network quality of service is used to measure the quality of a transmission system and the service effectiveness, as well as the capability of a service provider to meet the demands of users. 
NewSQL NewSQL   一个优雅的、定义良好的数据库系统,比SQL更易学习和使用,比NoSQL更晚提出的新型数据库。 An elegant, well-defined database system. It is easier to learn and use than SQL, and it is a newer type of database proposed after NoSQL. 
NodeManager NodeManager   NodeManager是MapReduce当中在工作节点中的代理,用来监控节点中的资源状况(CPU、内存等),并定期上报至ResourceManager当中。 An agent of the MapReduce on work nodes. It monitors resource status on a node (such as the CPU and memory usage) and periodically reports the data to the ResourceManager. 
对象数据库 Object database   对象数据库与关系型数据库不同,信息是以对象的形式表示和存储。 An object database differs from a relational database in that information is expressed and stored by object. 
数据开放平台 Open Data Platform ODP 数据开放平台,提供了数据开放的能力,可以将分析处理后的数据开发成API的形式,通过HUAWEI AEP(API Enabler Platform)对外部进行开放。用户订购付费后,可以通过这些API访问对应的数据。 The Open Data Platform (ODP) provides the data openness capability. It can develop data generated after data processing and analysis into APIs and open the APIs externally on Huawei API Enabler Platform (AEP). After subscribing to APIs, users can use the APIs to access corresponding data. 
原始数据层 Original data store ODS Universe大数据平台融合数据模型的原始数据层。大数据平台具有非常复杂的数据来源,这些数据存放在不同的地理位置、不同的数据库、不同的应用之中,从这些业务系统对数据进行抽取并不是一件容易的事。因此,设立原始数据层用于存放从业务系统直接抽取出来的数据,这些数据从数据结构、数据之间的逻辑关系上都与业务系统基本保持一致,因此在抽取过程中极大降低了数据转化的复杂性,原始数据层主要关注数据抽取的接口、数据量大小、抽取方式方面的问题。 Original data store of the CDM on Huawei Universe Analytics Platform. The data sources of the Universe are complex. The data sources are in different places, databases, and applications. It is not easy to extract data from these service systems. The Original Data Store stores data extracted from the service systems. During the extraction, the data structure and logical relationship between data remain unchanged, reducing the data transformation complexity. The Original Data Store focuses on the data extraction interfaces, data volume, and data extraction methods.
华为Hadoop集群的维护和管理部件 Operation Manager OM OM是一个基于WEB的监控和管理系统,实现对HDFS、MapReduce/YARN、HBase、Hive、Pig的web化操作和管理。 A web-based monitoring and management system that operates and manages the HDFS, MapReduce/YARN, HBase, Hive, and Pig. 
Oozie Oozie   Apache Oozie:是一个工作流引擎服务器,用于管理和协调运行在Hadoop平台上(HDFS、Pig和MapReduce)的任务。 The Apache Oozie is a workflow engine server that manages and schedules tasks in the Hadoop (including the HDFS, Pig, and MapReduce). 
Persona Persona   在客户洞察子系统中处理页面访问请求,实现标签和客户群的管理功能的网元。 An NE that processes page access requests and manages tags and customer segments in the Customer Insight subsystem. 
Persona Service Persona Service   在客户洞察中实现如实时标签和客户画像等高并发网络请求和消息队列的网元。 An NE in the Customer Insight subsystem that processes highly-concurrent network requests (such as real-time tags and customer profiles) and message queues. 
个性化推荐 Personalized recommendation   设计营销活动时,基于用户历史数据,采用协同过滤算法、模型文件、脚本方式实现个性化自动推荐。 When a marketing activity is designed, personalized automatic recommendation can be implemented based on historical user data through collaborative filtering algorithms, model files, and scripts. 
Pig Pig   Pig是一种编程语言,它简化了Hadoop常见的工作任务。Pig可加载数据、表达转换数据以及存储最终结果。Pig内置的操作使得半结构化数据变得有意义(如日志文件)。同时Pig可扩展使用Java中添加的自定义数据类型并支持数据转换。 A programming language that simplifies common Hadoop tasks. The Pig can load, express, transform, and store data. The built-in operations in the Pig can process semi-structured data (such as log files). The Pig can also use user-defined data types in the Java and transform data. 
分区键 Partition key PK 分区所依据的事件属性。 An event attribute used for partitioning. 
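As a hedged illustration of partitioning by an event attribute, the sketch below hashes the partition key value to choose a partition. The function name, the event fields, and the use of CRC32 are assumptions made for the example, not the platform's implementation.

```python
import zlib


def partition_for(event, key_attr, num_partitions):
    """Choose a partition from the value of the attribute used as partition key."""
    key = str(event[key_attr]).encode("utf-8")
    # CRC32 gives the same result across runs, unlike Python's built-in hash() for strings
    return zlib.crc32(key) % num_partitions


# Hypothetical events, partitioned by the user_id attribute
events = [
    {"user_id": "u1", "type": "call"},
    {"user_id": "u2", "type": "sms"},
    {"user_id": "u1", "type": "data"},
]
partitions = [partition_for(e, "user_id", 4) for e in events]
```

Events that share a partition key value (here the two `u1` events) always land in the same partition, so they can be processed together and in order.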
模式匹配引擎 Pattern Matching Engine PME 流处理内部模块。PME是复杂事件处理系统的核心部件,它通过监控整个运营商网络的事件,并根据自定义的模式匹配规则,实时地处理大量的基础事件,将其逻辑转化为具有意义的复合事件,再发送给业务系统或者消息中间件,最终实现整个运营商网络的智能监控和智慧运营。 A module of the Streaming. The Pattern Matching Engine (PME) is a core component of the complex event processing (CEP) system. The PME monitors events on the entire carrier network, processes a large number of basic events in real time based on user-defined pattern matching rules, converts them into meaningful complex events, and sends the complex events to service systems or messaging middleware, finally achieving intelligent monitoring and smart operation of the entire carrier network. 
策略中心 Policy Center PC 数字化营销的模块之一。Policy Center是离线分析能力和实时处理能力的汇集点,能引入各种离线数据如挖掘模型、客户标签、基站信息和互联网偏好等业务数据,辅助更精准的实时决策。 A module of the Universe Digital Marketing. The Policy Center is the convergence point of the offline analysis and real-time processing capabilities. It can bring in various offline service data, including mining models, customer tags, BTS information, and Internet preferences, helping make real-time decisions more accurate.
统一门户 Portal   统一门户是华为大数据Universe各个子系统的统一登录界面,通过门户提供统一的系统管理功能。 Provides a unified entry for logging in to Universe subsystems. The Universe Analytics Platform provides the unified system management functions through the Portal.
集成预测 SmartMiner   Universe大数据分析平台集成预测模块支持处理覆盖B域(业务支撑)、O域(网络运营)、M域(企业管理)和互联网多种输入数据,内置20+算法,自动化实现算法选择,支持实现客户分类、挖掘客户潜在需求挖掘和数据预测分析。 Processes data from the business support system (BSS), operations support system (OSS), management support system (MSS) domains and the Internet. The module provides more than 20 built-in algorithms, implements automatic algorithm selection, and supports data mining and analysis capabilities such as customer segmentation, mining of customers' potential requirements, and data forecast.
推荐引擎 Recommendation engine   推荐引擎算法根据用户之前的购买行为或其他购买行为向用户推荐某种产品。 Recommends offerings to users based on user purchase behavior.
基础知识库 Referential Knowledge Repository RKR 提供终端库,网站库,分词库,为其他分析模块尤其是互联网分析使用。 Provides the terminal library, website library, and word segmentation library for other analysis modules, especially the Internet analysis module. 
实时决策 Real-time decision RTD 在用户触发事件时,系统快速判断营销规则并及时作出决策。支持事件触发的实时营销。基于时机合成、策略配置、关联计算进行策略匹配和调度,帮助运营商从客户实时数据中获得更多价值。 When a user triggers an event, the system quickly evaluates the marketing rules and makes a decision promptly. This feature supports event-triggered real-time marketing, and performs policy matching and scheduling based on moment combination, policy configuration, and association calculation, enabling carriers to obtain more value from customers' real-time data.
任务规则 Rule   在Streaming系统中,表示一种Spark Streaming计算任务规则,例如互联网营销计算任务。当输入确切的参数后,就可以提交为一个计算实例。 A rule for a Spark Streaming computing task (such as the Internet marketing computing task) in the Streaming. After parameters are specified, a computing instance can be submitted. 
Secondary NameNode Secondary NameNode   帮助元数据节点管理元数据的工具。用于镜像备份,或日志与镜像的定期合并。 Performs periodic checkpoints of the Namespace and helps keep the size of file containing log of HDFS modifications within certain limits at the NameNode.
Shark Shark   基于Spark的框架基础上提供的和Hive一样的SQL命令接口。 A SQL command interface similar to the Hive based on the Spark framework. 
Shuffle Shuffle   当map任务执行完成后,对map任务产生的中间数据的归并和排序并传输到reducer的过程称之为shuffle。 A procedure that merges and sorts intermediate data of a map task after the task is completed and transfers the result to the reducer.
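The merge-sort-transfer sequence in the Shuffle entry can be illustrated with a minimal, framework-free sketch. The function names and the byte-sum partitioner are assumptions made for illustration only; they are not the Hadoop or Spark implementation.

```python
from collections import defaultdict


def partition_of(key, num_reducers):
    # Hypothetical deterministic partitioner: byte sum of the key modulo reducer count.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Merge the (key, value) pairs of all map tasks, group them by key,
    and return one key-sorted list per reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in map_outputs:  # one list of pairs per finished map task
        for key, value in pairs:
            partitions[partition_of(key, num_reducers)][key].append(value)
    # Sort each reducer's input by key before handing it over.
    return [sorted(p.items()) for p in partitions]


# Two map tasks emitting word counts, shuffled to two reducers.
map_outputs = [[("a", 1), ("b", 1)], [("a", 1)]]
reducer_inputs = shuffle(map_outputs, num_reducers=2)
```

Each reducer then receives every value for the keys assigned to it, already grouped and sorted.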
Spark Spark   Spark是UC Berkeley AMP lab所开源的类Hadoop MapReduce的通用并行框架,拥有Hadoop MapReduce所具有的优点;但不同于MapReduce的是Job中间输出结果可以保存在内存中,从而不再需要读写HDFS,因此Spark能更好地适用于数据挖掘与机器学习等需要迭代的MapReduce的算法。 A general-purpose parallel framework similar to Hadoop MapReduce. It was open-sourced by the UC Berkeley AMP Lab and has the advantages of Hadoop MapReduce. Unlike MapReduce, Spark can keep the intermediate output of a job in memory instead of reading and writing it through the HDFS, which makes Spark better suited to iterative algorithms such as those used in data mining and machine learning.
Spark Streaming Spark Streaming   构建在Spark上的流处理框架。 A streaming processing framework based on the Spark. 
Sqoop Sqoop   Sqoop是一款开源的工具,主要用于在Hadoop(Hive)与传统的数据库(mysql、postgresql...)间进行数据的传递,可以将一个关系型数据库(例如:MySQL、Oracle、Postgres等)中的数据导进到Hadoop的HDFS中,也可以将HDFS的数据导进到关系型数据库中。 An open-source tool for transferring data between Hadoop (Hive) and traditional databases (such as MySQL and PostgreSQL). It can import data from a relational database (such as MySQL, Oracle, or Postgres) into the HDFS of Hadoop, and export data from the HDFS back to a relational database.
自助分析 Self-service Analytics   实现对数据的可视化探索分析,在探索分析过程中提供联想式、向导式的智能化分析能力。 Supports visualized exploratory data analysis, and provides intelligent analysis capabilities in association and wizard modes during exploratory analysis.
实时流处理框架 Storm   Storm为分布式实时计算提供了一组通用原语,可被用于“流处理”之中,实时处理消息并更新数据库。这是管理队列及工作者集群的另一种方式。 Storm也可被用于“连续计算”(continuous computation),对数据流做连续查询,在计算时就将结果以流的形式输出给用户。它还可被用于“分布式RPC”,以并行的方式运行昂贵的运算。Storm可以方便地在一个计算机集群中编写与扩展复杂的实时计算,Storm用于实时处理,就好比 Hadoop 用于批处理。Storm保证每个消息都会得到处理,而且它很快:在一个小集群中,每秒可以处理数以百万计的消息。更棒的是你可以使用任意编程语言来做开发。 Provides a set of general primitives for distributed real-time computation. It can be used for stream processing, handling messages and updating databases in real time, as an alternative way of managing queues and worker clusters. Storm can also be used for continuous computation, running continuous queries on data streams and streaming the results to users as they are computed. It can also be used for distributed RPC, running expensive computations in parallel. Storm makes it easy to write and scale complex real-time computations on a computer cluster; Storm is to real-time processing what Hadoop is to batch processing. Storm guarantees that every message is processed, and it is fast: a small cluster can process millions of messages per second. It also supports development in any programming language.
流式数据处理 Stream processing   Universe Streaming采用业界标准流处理框架,将底层流计算各个外部部件(Kafka、Flume、Spark Streaming)预集成,使用自研的Streaming Server进行任务调度,结合自研的高性能复杂事件处理引擎PME(Pattern Matching Engine),提供高效、实时、可灵活扩展的数据处理能力。为用户提供简单易用的实时事件处理平台。 The data ingestion subsystem provided by the Universe. It adopts the standard stream processing framework in the industry, pre-integrates the Flume, Kafka, and Spark, and works with Huawei-developed high-performance pattern matching engine to provide efficient, real-time, and flexibly scalable data processing capability and an easy-to-use and real-time event processing platform.
Streaming Server Streaming Server   统一流数据处理平台,Streaming中的管理模块。用于管理底层流部件,对底层流计算部件进行统一的任务调度。 A management module in the Streaming for unified streaming data processing. It manages bottom-layer stream processing components and performs unified task scheduling for stream computing components at the bottom layer. 
语音查询 Survey response   语音查询功能是一项更方便于手动输入的一种连接电子信息的方式,在生活中最常见的有:电话语音查询,手机语音拨号。其主要用于开车和手脚不方便的群体。 A way of accessing digital information that is more convenient than manual input. Common examples include voice query by telephone and voice dialing on mobile phones. It mainly serves users who are driving or who have limited mobility.
详细上网记录 detailed charging record   用户详细的上网记录。在部分用户对计费有疑问时,运营商查出来跟用户核对的,避免用户对计费进行投诉。区别于CDR(CDR是指计费网关中的计费话单,提供给运营商,用于对用户收费)。 Detailed user Internet access record. It is used for verifying charging information and preventing user complaints about charging. The detailed charging record (DCR) is different from the CDR in that the CDR is used in the Charging Gateway for carriers to charge users. 
实时营销 real-time marketing   营销的一种类型,区别与周期性的批量营销。实时营销是指根据消费者当前的行为以及其历史数据,实时为其提供个性化的商品或服务。 A marketing type different from periodic batch marketing. Real-time marketing provides customers with personalized offerings or services based on their current consumption behavior and historical consumption data. 
事件 event   事件是指带业务语义的用户行为,例如在某类型位置(例如营业厅、商圈)停留,访问某一类型网站(例如视频类网站、音乐类网站)或使用某一类型App(例如社交类App或游戏类App)。
当外部系统获取客户行为后,将行为作为事件上报给营销系统,营销系统基于行为达到的条件判断是否进行营销。
A user action with service meaning, for example, staying at a type of location (for example, a store or business district), visiting a type of website (for example, a video website or a music website), or using a type of app (for example, a social app or a game app).
An external system reports obtained customer behavior to the Campaign, and then the Campaign determines whether to launch marketing based on whether the behavior meets specified conditions. 
事件服务 Event service   事件服务,比如客户发生掉话事件,新开户等。 A service for customer events, for example, a call drop event or a new account registration.
数据备份 data backup   将重要数据拷贝到备用存储区中的方法,用以防止原存储空间损坏或崩溃。 A method used to copy important data to the standby storage to prevent data loss in the case of damage or failure in the original storage. 
数据分析师 Datician   数据分析师指的是不同行业中,专门从事行业数据搜集、整理、分析,并依据数据做出行业研究、评估和预测的专业人员。 Professional personnel engaged in collecting, organizing, and analyzing data from various industries and conducting industry research, evaluation, and prediction based on the data analysis results.
标签商铺 Tag Store   标签商铺可以对标签的全生命周期进行有序管理,包括标签分类、标签创建、标签规则配置、版本管理、生命周期管理。 Provides a series of tag management operations, including tag management by category, tag creation, tag rule configuration, tag version management, and tag life cycle management. 
TeaStore TeaStore   TeaStore是一个华为自主研发的NoSQL数据库系统,和流行的NoSQL数据库一样,主要解决互联网业务特征的数据存储和检索。 Similar to mainstream NoSQL databases, the TeaStore, a NoSQL database developed by Huawei, provides a mechanism for storage and retrieval of Internet service data.
Token Token   PME的分区单位,按照一定的规则将数据划分为n个token,每个token都与一个执行引擎物理节点绑定。 Partition unit of the PME. Data is divided into N tokens based on the partitioning key, and each token is bound to a physical Engine node. 
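As an illustration of the Token entry, the sketch below maps a partitioning key to one of N tokens and binds each token to an engine node. The CRC32-modulo rule, the round-robin binding, and all names are assumptions; the entry only states that data is divided "based on the partitioning key", so the actual PME rule may differ.

```python
import zlib


def token_for(partition_key: str, num_tokens: int) -> int:
    # Assumed rule: stable CRC32 hash of the key, modulo the token count.
    return zlib.crc32(partition_key.encode("utf-8")) % num_tokens


def bind_tokens(num_tokens: int, nodes: list) -> dict:
    # Assumed binding: tokens are spread over the engine nodes round-robin.
    return {token: nodes[token % len(nodes)] for token in range(num_tokens)}


def node_for(partition_key: str, num_tokens: int, binding: dict) -> str:
    # Route a record to the node that owns its token.
    return binding[token_for(partition_key, num_tokens)]


nodes = ["engine-1", "engine-2"]  # hypothetical node names
binding = bind_tokens(4, nodes)
owner = node_for("user-42", 4, binding)
```

Because the hash is stable, all records with the same partitioning key land on the same token, and therefore on the same engine node.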
图分析 Graph Analytics   用于社交分析的基础工具算法。 A basic tool algorithm for social analysis.
任务 Task   在Streaming系统中,表示可在Yarn运行的一个Spark Streaming计算实例,一般附有Task ID。Task可被提交、停止、查询。任务状态有正在提交、已提交、正在运行、正在停止、已停止等。 In the Streaming system, a Spark Streaming computing instance that can run on YARN. A task generally has a task ID and can be submitted, stopped, and queried. Task statuses include submitting, submitted, running, stopping, and stopped.
统一分析开发平台 Unified Analytics Development Platform UADP 大数据分析平台的子系统。统一分析开发平台主要面向业务运营人员、数据科学家、第3方应用开发商3类角色,提供一站式开发工具。 A subsystem of the Universe Analytics Platform. The Unified Analytics Development Platform (UADP) provides a one-stop development tool for service operation personnel, data scientists, and third-party application developers.
统一分析开放平台 Unified Analytics Openness Platform UAOP 大数据分析平台的子系统。统一分析开放平台面向运营商内外部用户和系统,提供基于API的数据服务开放和基于服务的平台能力开放。 A subsystem of the Universe Analytics Platform. The Unified Analytics Openness Platform (UAOP) provides API-based data service openness and service-based platform capability openness for internal and external users and systems of carriers.
统一分析运行平台 Unified Analytics Runtime Platform UARP 大数据分析平台的子系统。提供电信分析应用所需的丰富的、高性能的分析引擎,为统一分析开发平台、统一数据治理平台提供统一的运行环境和运维功能。 A subsystem of the Universe Analytics Platform. The Unified Analytics Runtime Platform (UARP) provides a rich set of high-performance analysis engines for telecom analysis applications, and provides a unified runtime environment and O&M functions for the Unified Analytics Development Platform and Unified Data Governance Platform.
统一数据访问 Unified Data Access UDA 数据治理套件的内部模块。华为统一数据访问平台,通过提供统一的数据访问接口构建透明数据访问能力,对上层屏蔽底层异构数据源处理实现细节,通过无差别的使用体验来降低云平台使用的难度,提升上层应用的开发效率。 Internal module of the Data Governance suite. The Unified Data Access (UDA) provides unified data access interfaces for transparent data access and shields upper-layer applications from the processing details of the underlying heterogeneous data sources. The uniform access experience makes the cloud platform easier to use and improves the development efficiency of upper-layer applications.
统一数据治理平台 Unified Data Governance Platform UDGP 大数据分析平台的子系统。统一数据治理平台为运营商提供全面高效的数据资产管控环境,包括统一的数据采集和整合,统一的安全、标准、生命周期和质量管理,以及多维度数据云图功能。 A subsystem of the Universe Analytics Platform. The Unified Data Governance Platform (UDGP) provides carriers with a comprehensive and efficient data asset management and governance environment, including unified data ingestion and integration, unified management of security, standards, life cycle, and quality, and a multidimensional data landscape function.
大数据分析平台 Universe Analytics Platform   Universe大数据分析平台是华为依托行业经验,精心打造的大数据分析产品模块,汇聚、共享、运营电信领域专业化知识,支撑运营商数字化转型。 A big data analysis product module that Huawei has carefully built on its industry experience. It aggregates, shares, and operates specialized knowledge in the telecom domain, and supports the digital transformation of carriers.
Universe Video Analytics Universe Video Analytics   Universe Video Analytics为整个系统提供数据采集、数据分析能力,提供准实时的数据仓库服务。
基于SQM的PCF和Universe的BDI/Kafka进行构建,提供视频的大数据采集(包括终端探针、服务器探针、独立探针),支持数据仓库和数据建模服务能力。
Provides data ingestion, data analysis, and quasi-real-time data warehouse services.
The Universe Video Analytics (UVA) is built on the PCF of the SQM and the BDI/Kafka of the Universe Analytics Platform. It performs video big data ingestion (including terminal probes, server probes, and independent probes), and supports data warehouse and data modeling service functions.
统一调度 Unified Scheduling   华为Universe大数据分析平台统一调度模块提供可视化的工作流调度开发和编辑平台,方便用户通过拖拽的方式编辑任务执行逻辑,实现跨集群、跨系统任务的集中调度和管理。 Provides a platform for developing and editing work flow scheduling information in a visualized mode, which enables users to edit task logic in dragging mode and implement unified scheduling and management of cross-cluster and cross-system tasks.
数字化营销 Universe Digital Marketing   Universe Digital Marketing是华为依托行业经验,精心打造的精准、实时、敏捷的数字化营销平台,提供营销管理、营销策划、营销执行、营销评估、多渠道协同等端到端能力。 An accurate, real-time, and agile digital marketing platform created based on industry experience. It provides end-to-end capabilities such as marketing management, marketing planning, marketing execution, marketing evaluation, and multi-channel collaboration.
视频大数据辅助运营决策中心 Video Decision System VDS VDS为TV业务运营商提供业务运营辅助决策能力,包括用户画像、业务报表、推荐、搜索等。依托华为大数据分析平台进行构建,提供报表定制能力。 Assists TV service carriers in service operation decision making by providing customer profiles, service reports, and recommendation and search functions. It is based on Huawei Universe Analytics Platform and supports report customization.
Vertical Search Vertical Search VSearch 垂直搜索引擎是与通用搜索引擎截然不同的引擎类型。垂直搜索引擎专注具体、深入的纵向服务,致力于某一特定领域内信息的全面和内容的深入,这个领域外的闲杂信息不收录 An engine type different from general search engines. The Vertical Search focuses on detailed and in-depth vertical search for comprehensive content in a specified domain. Other information unrelated to the domain is not included. 
资源知识库 Resource knowledge base   资源类知识库通过搜集、更新领域规则、对现实世界进行建模,抽象常识概念和领域概念,获得实体、实体属性和实体间关系,可用于对运营商管道数据、电信领域数据、时间和位置、客户行为等进行语义化标注,并以此为基础做进一步分析。 By collecting and updating domain rules, resource knowledge bases model the real world, abstract common-sense concepts and domain concepts, and obtain entities, entity attributes, and relationships between entities. These can be used to semantically annotate carrier pipe data, telecom domain data, time and location data, and customer behavior, which is the basis for further analysis.
Web Service Web service   Web Service是一个平台独立的、低耦合的、自包含的、基于可编程的web应用程序,可使用开放的XML标准来描述、发布、发现、协调和配置这些应用程序,用于开发分布式的互操作的应用程序。 A programmable web-based application that is platform-independent, loosely coupled, and self-contained. Such applications can be described, published, discovered, coordinated, and configured using open XML standards. Web services are used to develop distributed, interoperable applications.
文本分析 Text Analysis   文本分析是指对文本的表示及其特征项的选取;文本分析是文本挖掘、信息检索的一个基本问题,它把从文本中抽取出的特征词进行量化来表示文本信息。将它们从一个无结构的原始文本转化为结构化的计算机可以识别处理的信息,即对文本进行科学的抽象,建立它的数学模型,用以描述和代替文本。使计算机能够通过对这种模型的计算和操作来实现对文本的识别。由于文本是非结构化的数据,要想从大量的文本中挖掘有用的信息就必须首先将文本转化为可处理的结构化形式。 The representation of texts and the selection of their feature items. Text analysis is a basic problem in text mining and information retrieval: it quantifies the feature words extracted from a text to represent the text information, converting unstructured original text into structured information that computers can recognize and process. In other words, it abstracts the text scientifically and builds a mathematical model to describe and stand in for the text, so that computers can recognize the text by computing and operating on the model. Because text is unstructured data, it must first be converted into a processable structured form before useful information can be mined from large volumes of text.
销售品 Offering   运营商以营销为目的,按照一定的市场策略,对产品定价、包装后形成的可直接提供给客户选择的销售单元。 A sales unit generated after products are priced and packaged based on a certain marketing policy. 
信令采集 Signaling ingestion   实现MAP信令的监测,采集或拦截,将漫游用户信息存储于数据库,并提供给相关的业务进行处理。 Monitors, ingests, and intercepts MAP signaling, and stores roaming user information in databases for other services to process. 
星型模型 Star model   星型架构是一种非正规化的结构,多维数据集的每一个维度都直接与事实表相连接。 A denormalized structure in which each dimension of the multidimensional data set is directly connected to the fact table.
虚拟化 Virtualization   虚拟化,是指通过虚拟化技术将一台计算机虚拟为多台逻辑计算机。在一台计算机上同时运行多个逻辑计算机,每个逻辑计算机可运行不同的操作系统,并且应用程序都可以在相互独立的空间内运行而互不影响,从而显著提高计算机的工作效率。 A technology that virtualizes a physical device into multiple independent logical devices. A computer can simulate multiple logical devices. Each logical device can run an operating system and have different applications deployed independently, significantly improving device operating efficiency. 
雪花模式 Snowflake mode   指一种扩展的星形图。星形图通常生成一个两层结构,即只有维度和指标,雪花图生成了附加层。实际数据仓库系统建设过程中,通常只扩展三层:维度(维度实体)、指标(指标实体)和相关的描述数据(类目细节实体)。超过三层的雪花图模型在数据仓库系统中应该避免。因为它们开始更倾向于支持OLTP的应用程序的规格化结构,而不是为数据仓库和OLAP应用程序而优化的非格式化结构。 An extended star schema. A star schema usually produces a two-layer structure with only dimensions and KPIs, whereas a snowflake schema adds further layers. In practice, a data warehouse is usually extended to only three layers: dimensions (dimension entities), KPIs (KPI entities), and related description data (category detail entities). Snowflake models with more than three layers should be avoided in data warehouse systems, because they lean toward the normalized structures that support OLTP applications rather than the denormalized structures optimized for data warehouse and OLAP applications.
雪花模型 Snowflake model   雪花模型是对星型模型的扩展。它对星型模型的维表进一步层次化,原有的各维表可能被扩展为小的事实表。 An extension of the star model. The snowflake model further refines the levels of the dimension tables in the star model. The original dimension tables may be extended into smaller fact tables.
Hadoop资源管理框架 Yet Another Resource Negotiator Yarn YARN是Hadoop 2.0中的资源管理系统,它是一个通用的资源模块,可以为各类应用程序进行资源管理和调度。YARN不仅仅局限于MapReduce一种框架使用,也可以供其他框架使用,比如Tez、Spark、Storm等。 Resource management system in Hadoop 2.0. It is a general resource management module that manages and schedules resources for applications. The YARN can be used not only in the MapReduce framework, but also in other frameworks such as Tez, Spark, and Storm.
语音分析 Voice Analytics   语音分析把语音文件转成文本。可以做后续文本分析。 Transforms voice files into texts for subsequent text analysis.
消息序列 Zero Message Queue ZMQ 类似于Socket的一系列接口,其与Socket的区别是:普通的socket是端到端的(1:1的关系),而ZMQ却是可以N:M 的关系。ZMQ用于node与node间的通信,node可以是主机或者进程。 A series of interfaces similar to the socket. The ZMQ differs from the socket in that a socket works in 1:1 mode and the ZMQ in N:M mode. The ZMQ is used for communication between hosts or processes. 
对照组 Control group   用于选择实验组客户群,实验组客户群不进行营销,在营销活动结束后用于比对营销效果。 No marketing is conducted for users in a control group. The control group is used for marketing effect comparison after a marketing activity. 
AB测试组 A/B test group   可按百分比或数量对已选客户群进行随机分流,对不同分支的客户进行不同的营销,在营销活动结束后用于比对营销效果。 A selected customer segment is randomly split into branches by percentage or quantity. Different marketing is then applied to the customers in each branch, and the marketing effect of the branches is compared after the marketing activity ends.
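The random split by percentage can be sketched as follows. This is only an illustration under assumed names (`split_ab`, a fixed `seed` for repeatability); it is not the platform's grouping logic.

```python
import random


def split_ab(customers, percentages, seed=0):
    """Randomly split customers into one branch per percentage.
    The last branch takes the remainder so that nobody is dropped."""
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    shuffled = list(customers)
    rng.shuffle(shuffled)
    branches, start = [], 0
    for i, pct in enumerate(percentages):
        if i == len(percentages) - 1:
            end = len(shuffled)
        else:
            end = start + round(len(shuffled) * pct / 100)
        branches.append(shuffled[start:end])
        start = end
    return branches


# 100 hypothetical customer IDs, 30% in branch A and 70% in branch B.
branch_a, branch_b = split_ab(range(100), [30, 70])
```

Every customer ends up in exactly one branch, so the marketing effect of the branches can be compared without overlap.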
全局对照组 Global control group   在一段时间内,在所有活动中,对照组内的用户不进行任何营销。 Within a period of time, no marketing is conducted for users in the global control group in any activity. 
活动级对照组 Activity-level control group   影响活动内的战役,对照组内的用户不进行任何营销;其他活动不受此对照组影响。 For an activity configured with an activity-level control group, no marketing is conducted for users in the control group in all campaigns of this activity. The control group is invalid for other activities. 
战役级对照组 Campaign-level control group   影响战役内的客户群,对照组内的用户不进行任何营销;其他战役不受此对照组影响。 For a campaign configured with a campaign-level control group, no marketing is conducted for users in the control group. The control group is invalid for other campaigns. 
时机 Moment   触发营销活动的时机,是事件营销的必备属性。
时机是营销领域的概念,由Campaign Management子系统来定义,并在CAE里实现。
时机在物理上可以是以下任何一种Event:
1、CAE采集的原始消息,比如外部营销系统直接发过来的“首次使用SIM卡”webservice消息,该消息将直接作为时机、触发营销规则执行;
2、CAE识别(计算)出来的一种用户行为,比如“用户驻留在某地**分钟”、“用户当月消费流量达到**MB”;用户行为输出为Event,通过Kafka Topic被营销系统订阅。
3、若干个CAE Event的组合,比如“用户当月消费流量达到**MB”和“用户最近x天使用某app消费流量达到y MB”这两个Event求与、或,或其他关系运算得出。
Moments trigger marketing activities and are mandatory attributes for event marketing.
The moment is a concept in the marketing domain. Moments are defined in the Campaign Management and implemented in the CAE.
Physically, any of the following events can be moments:
1. Original messages collected by the CAE, for example, the "Initial SIM Card Usage" WebService message directly sent by an external marketing system. This message directly functions as a moment and triggers the execution of marketing rules.
2. User behavior recognized or calculated by the CAE, for example, "User Staying at a Place for XX Minutes" and "User Traffic Usage Reaching XX MB in the Current Month". User behavior is output as events, which the marketing system subscribes to through Kafka topics.
3. A combination of multiple CAE events, for example, "User Traffic Usage Reaching XX MB in the Current Month" and "Traffic Usage Reaching Y MB by Using an App in Recent X Days" combined with AND, OR, or another relational operation.
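The third kind of moment, a combination of several events, can be sketched as small composable predicates. The event names and helper functions below are hypothetical; the CAE's real rule format is not described in this glossary.

```python
from typing import Callable, Set

# A moment is modeled as a predicate over the set of event names seen for a user.
Moment = Callable[[Set[str]], bool]


def event(name: str) -> Moment:
    # A single event acts as a moment on its own.
    return lambda occurred: name in occurred


def all_of(*moments: Moment) -> Moment:
    # AND combination: every sub-moment must hold.
    return lambda occurred: all(m(occurred) for m in moments)


def any_of(*moments: Moment) -> Moment:
    # OR combination: at least one sub-moment must hold.
    return lambda occurred: any(m(occurred) for m in moments)


# Hypothetical event names standing in for the two examples above.
monthly = event("monthly_traffic_reached")
app = event("app_traffic_reached")
both = all_of(monthly, app)
either = any_of(monthly, app)
```

A marketing rule would then fire when the combined predicate evaluates to true for the events observed for a user.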
社交营销 Social marketing   社交营销面向应用市场人员提供社交账号管理、社交发布内容管理、社交监听、社交分析的功能。 Social marketing provides social media account management, content management, listening, and analysis functions for application marketing personnel.
社交监听 Social media listening   市场人员可以设置监听的主题关键词,监听的社交媒体,监听的语言和地域,以了解监听的关键词在不同社交媒体上的分布情况和内容。 Marketing personnel can configure the topic keywords, social media, languages, and regions to monitor, in order to understand how the monitored keywords are distributed across different social media and in what content they appear.
社交分析 Social analysis   市场管理人员可以查看已经发布的内容在社交媒体上的传播效果,对人们的评论情况分析,用来改进后续的营销。 Marketing management personnel can check how published content has spread on social media and analyze people's comments to improve subsequent marketing.
社交发布 Social publishing   市场人员可以设置需要在不同社交媒体上发布的内容和计划发布时间,也可以通过和Campaign系统集成,设置营销内容在社交渠道的发布。 Marketing personnel can configure content to be published on different social media and the planned publishing time. They can also configure the publishing of marketing content through the social channel in the Campaign. 
属性 Attribute   客户洞察中,对于分析主体(客户)基础特征的描述。 In the Customer Insight, attributes describe basic features of analysis objects (customers).
标签 Tag   为完成特定的业务目标,基于具体的应用场景需求,为具有相同特征的客户群体赋予的易识别的标识,是显著区别于其他群体的标识该群体的特征。 To achieve specific service objectives, a tag is used to identify customers with same features based on the application scenario. The tag of a customer segment differentiates the customer segment from other customer segments. 
领域知识 Domain knowledge   支撑分析主体(客户)洞察过程中参考的,与领域相关的各类常识、事实、规则和专家经验。 Domain knowledge is the reference material that supports insight into analysis objects (customers). It includes domain-related common knowledge, facts, rules, and expert experience.
客户群 Customer segment   基于某个业务目标、应用场景,采用一定技术手段,筛选、汇聚到一起的目标客户群体。 A customer segment contains a group of customers filtered and aggregated through a specified method based on a specific service objective and application scenario. 
客户画像 Customer profile   特定场景下客户描述客户的特征集合,即通过一组属性和标签组成观察客户的视角。 A customer profile is a feature set that describes a customer in a specific scenario. In other words, a customer profile is a perspective consisting of a group of attributes and tags for observing a customer. 
融合数据模型 Convergent data model CDM 电信行业领域模型,整合运营商的数据资产,把海量离散的、碎片化的数据加工形成具有商业价值的信息。 Convergent data models are models in the telecom industry. They integrate carriers' data assets and process massive scattered and fragmented data to obtain information that has commercial value. 
B域 BSS domain   业务域数据,例如CRM系统、Billing系统等。 Service domain, which includes systems such as the CRM and Billing. 
O域 OSS domain   网络域数据,例如信令系统、拨测系统、话务网管、数据网管、传输网管、网优系统、综合资源。 Network domain, which includes systems such as the signaling system, dialing test system, traffic network management system (NMS), data NMS, transmission NMS, network optimization system, and comprehensive resources. 
M域 MSS domain   企业管理域数据,例如财务系统、固定资产等。 Enterprise management domain, which includes the financial system and fixed assets. 
分析模型 Analysis model   在数据模型的基础上,通过挖掘算法或者复杂规则计算出基础数据背后高价值的数据 Based on data models, analysis models use mining algorithms or complex rules to derive the high-value data behind basic data.
谷歌分析追踪代码 Google Analytics Tracking Code GATC 谷歌分析追踪代码可提供内容实验工具,该工具由Google Analytics强力支持,可帮助应用开发人员优化应用。 Google Analytics Tracking Code provides content experiment tools, which are powered by Google Analytics and help application developers optimize applications.
推荐位 Recommendation slot   通常,Web页面或者Mobile App上展示个性化推荐商品都有固定的区域或者位置,这个位置就是商品推荐运营位,简称推荐位。推荐位唯一标识渠道上的一个推荐区域,每个推荐位由一个唯一的ID标识。客户系统的推荐请求必须携带推荐位ID。 Generally, personalized content recommendations are displayed in a fixed area or location on web pages or mobile apps. The fixed area or location is the content recommendation business slot (recommendation slot for short). A recommendation slot uniquely identifies a recommendation area in a channel. Each recommendation slot has a unique ID. A recommendation request from a customer system must contain a recommendation slot ID.
推荐策略 Recommendation policy   推荐策略也叫融合推荐策略。决定某个推荐位的一套完整的规则组合。一个推荐位可以设置多个生效的推荐策略,但同一时刻只有一个推荐策略生效。 A recommendation policy is also known as a convergent recommendation policy. A recommendation policy contains a set of complete rules that determine a recommendation slot. Multiple valid recommendation policies can be configured for one recommendation slot. However, only one recommendation policy takes effect at a time. 
策略融合 Policy-based convergence   把多个推荐算法(列表)和规则的推荐结果根据预制规则进行过滤、排列的过程。 Policy convergence refers to the process of filtering and sorting recommendation results generated by multiple recommendation algorithms or rules based on preconfigured rules. 
推荐算法 Recommendation algorithm   推荐算法的结果数据由推荐系统预先计算生成,计算过程使用到大数据分析挖掘平台,结果输出为一个商品推荐列表。 The result of a recommendation algorithm is calculated by the recommendation system in advance. The calculation uses the big data analysis and mining platform. The result is output as a content recommendation list.
推荐规则 Recommendation rule   手工配置的规则,根据上下文规则给特定用户推荐特定榜单或特定商品。 Recommendation rules are manually configured to recommend rankings or commodities to specific users based on the context. 
服务开放 Service openness   提供大数据分析服务、资源服务、数据服务给租户使用。不同服务之间通过DigitalFoundry提供的服务绑定关联来实现关联关系。 Through service openness, the big data analysis service, resource service, and data service are provided for tenants. Services are associated with each other through the service binding provided by the DigitalFoundry.
服务创建 Service creation   服务创建是服务提供商根据平台规范,提供相应的服务描述文件、软件包以及设计包等,在DigitalFoundry平台定义出服务以供开发者进行订购使用。 Service creation indicates that a service provider supplies the service description file, software package, and design package in accordance with platform specifications, and defines the service in the DigitalFoundry for developers to subscribe to.
服务上架 Service shelving   服务上架指的是服务创建完成后,经过管理员的审批,使得服务目录能够呈现此服务,供开发者订购。 Service shelving indicates that the administrator approves a created service to make it displayed in the service catalog for developers to subscribe to. 
服务申请 Service application   服务申请指的是开发者订购服务,使用申请得到的服务实例进行应用开发。 Service application indicates that a developer subscribes to a service and uses the applied service instance to develop applications. 
服务绑定 Service binding   服务绑定指的是服务实例与应用或者是其他服务实例建立绑定关系,应用或者其他服务实例能够获取到被绑定服务的相关信息。 Service binding refers to binding between a service instance and an application or another service instance. Applications and service instances can obtain information about the bound service. 
服务去绑定 Service unbinding   服务去绑定指的是服务实例与应用或者是其他服务实例消除绑定关系,应用或者其他服务实例将不再能够使用此服务实例。 Service unbinding refers to unbinding between a service instance and an application or another service instance. Applications and service instances can no longer use the unbound service instance. 
服务扩缩容 Service scaling   服务扩缩容指的是根据业务量的大小,手工或者自动的扩大或者缩小服务实例能够承载的业务量。 Service scaling refers to manually or automatically increasing or decreasing the maximum volume that a service instance can bear. 
服务注销 Service deregistration   服务注销指的是开发者不再需要此服务时销毁申请的服务实例。 Service deregistration indicates that a developer destroys an applied service instance when the instance is no longer required.
服务下架 Service suspension   服务下架指的是服务目录上不再呈现此服务,开发者不再能够订购此服务。 Service suspension indicates that a service is removed from the service catalog. After a service is suspended, developers cannot subscribe to it. 
服务删除 Service deletion   服务删除指的是从DigitalFoundry平台彻底删除服务的定义以及相关的软件包。 Service deletion indicates that the definition and related software packages are completely removed from the DigitalFoundry. 
弹性伸缩 Autoscaling   弹性伸缩是根据用户的业务需求和策略,自动调整其弹性计算资源的一种服务。以达到业务在运行高峰时期,无缝增加业务实例;以及在业务需求下降时期,自动减少业务实例,从而达到节约成本的目的。 Autoscaling is a service of automatically adjusting computing resources based on users' service requirements and policies. Autoscaling can seamlessly add service instances during peak hours and reduce service instances during off-peak hours, which helps reduce costs. 
门户内容管理 Portal content management PCM 提供公告、待办、收藏、多租户管理和数据源管理基本功能。 Portal content management provides basic functions including announcements, to-do tasks, favorites, multi-tenant management, and data source management. 
API提供者 API provider   API提供者指在API商城上发布API的用户,并通过销售API获得收益。 API提供者可以创建编辑API,并在商用环境中部署和销售。 An API provider is a user who releases APIs in the API store and gains profits by selling the APIs. An API provider can create and edit APIs and deploy and sell APIs in the commercial environment. 
API消费者 API consumer   API消费者指通过API商城测试或购买API的用户,API消费者可以使用购买到的API来开发应用。 An API consumer is a user who tests or buys APIs in the API store. An API consumer can use purchased APIs to develop applications.
下一个最佳购物建议 Next best offering NBO 个性化推荐相关的通用称呼。 Common concept related to personalized recommendation. 
下一步最佳行动 Next best action NBA 个性化服务相关的通用称呼。 Common concept related to personalized services.
水印 Watermark   数字水印过程就是向被保护的数字对象(如静止图像、视频、音频等)嵌入某些能证明版权归属或跟踪侵权行为的信息,可以是作者的序列号、公司标志、有意义的文本等等。
从视觉角度,分为可见水印和不可见水印。顾名思义,就是以嵌入水印后,能否被人以肉眼识别水印为依据划分的。
Watermarking is the process of embedding, into the digital object to be protected (for example, a still image, video, or audio), information that can prove copyright ownership or help track infringement. The information can be the author's serial number, a company logo, or meaningful text.
Watermarks are classified as visible or invisible, depending on whether the embedded watermark can be recognized by the naked eye.
假名化 Pseudonymization   为了限制通过个人数据来识别数据主体,个人数据中包含的身份信息可以被假名替代,这种替代就是假名化。假名化的两个属性是:(1)和假名相关联的其他属性不足以识别出这些属性关联的数据主体;(2)除假名分配者外,隐私相关方(例如数据控制者)在有限的努力下无法根据假名逆推出数据主体。假名化以后的数据依然属于个人数据。 To limit the identification of data subjects from personal data, identity information contained in personal data can be replaced with pseudonyms. This replacement is called pseudonymization. Pseudonymization has two properties: (1) the other attributes associated with a pseudonym are insufficient to identify the data subject they relate to; (2) except for the party that assigns the pseudonyms, privacy stakeholders (for example, the data controller) cannot derive the data subject from a pseudonym with limited effort. Pseudonymized data is still personal data.
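One common way to replace an identifier with a pseudonym is a keyed hash, sketched below with HMAC-SHA256 from the Python standard library. This technique is an assumption chosen for illustration; the glossary does not say how the platform assigns pseudonyms. Only the holder of the secret key (the pseudonym assigner) can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an identifier.
    Without secret_key the pseudonym cannot be recomputed from the identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and subscriber number, for illustration only.
key = b"example-secret-key"
pseudonym = pseudonymize("8613800000000", key)
```

The same identifier always yields the same pseudonym under one key, so records can still be joined after pseudonymization, which is one reason the data remains personal data.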
分析应用开发 Analysis Studio AS Analysis Studio是一个集取数、分析和应用开发为一体的开放性平台,采用了组件化、服务化的架构设计,通过简单的拖拽和配置,自助式完成对数据的提取、分析、探索和预测,分析结果能固化为应用,并与其他角色和用户间共享。 The Analysis Studio is an open platform that integrates data retrieval, data analysis, and application development. It uses a component-based and service-oriented architecture, and lets users extract, analyze, explore, and predict data in a self-service manner through simple dragging and configuration. The analysis results can be solidified into applications and shared with other roles and users.
交互分析报表 Interactive Self Analytics Report ISAR 前身为V300R001C11中的ISA,是一个专业的数据分析和展现平台,提供了静态报表、仪表盘、OLAP分析、自助取数、指标库、自助分析等整套分析解决方案。 The Interactive Self Analytics Report (ISAR) was called ISA in V300R001C11. It is a professional data analysis and display platform that provides a complete set of analysis solutions, including static reports, dashboards, OLAP analysis, self-service data retrieval, a KPI library, and self-service analysis.
血缘分析 Lineage Analysis   血缘分析(又叫血统分析)是指从某一实体作为起点,往回追溯其数据处理过程,直到相关的数据源接口。为实现血缘分析功能,对于任何指定的实体,首先获得该实体的所有前驱实体,然后对这些前驱实体递归地获得各自的前驱实体,结束条件是所有实体到达数据源接口或者是实体没有相应的前驱实体。 Lineage analysis refers to the process of starting from an entity and tracing its data processing back to the related data source interfaces. To implement lineage analysis for a given entity, all predecessor entities of the entity are obtained first, and then the predecessors of those entities are obtained recursively. The recursion ends when every entity has reached a data source interface or has no predecessor entity.
影响分析 Impact Analysis   影响分析是指从某一实体出发,寻找依赖该实体的实体。如果需要可以采用递归方式寻找所有实体。该功能支持当某些实体发生变化或者需要修改时,评估实体影响范围。 Impact analysis refers to the process of starting from an entity and searching for the entities that depend on it. If necessary, the search can be applied recursively to find all such entities. Impact analysis is used to evaluate the scope of impact when an entity changes or needs to be modified.
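Lineage analysis and impact analysis walk the same dependency graph in opposite directions. The sketch below assumes the metadata is available as a dictionary mapping each entity to its direct predecessors; the entity names and function names are hypothetical.

```python
def lineage(entity, predecessors, seen=None):
    """Recursively collect every upstream entity of `entity`.
    Recursion stops at entities with no predecessors (data source interfaces)."""
    seen = set() if seen is None else seen
    for pred in predecessors.get(entity, []):
        if pred not in seen:  # also guards against cycles
            seen.add(pred)
            lineage(pred, predecessors, seen)
    return seen


def impact(entity, predecessors):
    """Collect every entity that depends on `entity`, directly or indirectly,
    by running the same traversal over the inverted graph."""
    successors = {}
    for node, preds in predecessors.items():
        for pred in preds:
            successors.setdefault(pred, []).append(node)
    return lineage(entity, successors)


# Hypothetical metadata: a report built from an aggregate of two raw tables.
predecessors = {"report": ["agg"], "agg": ["raw_a", "raw_b"]}
upstream = lineage("report", predecessors)
downstream = impact("raw_a", predecessors)
```

Here `upstream` answers "where did this report's data come from?", while `downstream` answers "what breaks if `raw_a` changes?".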
数据标准 Data standard   数据标准为数据模型的设计提供了规范和约束,为元数据和数据的质量保证提供了技术支撑。本期版本的数据标准主要包括:数据分层标准、模型设计标准(逻辑实体命名、字段命名、数据类型等)、模型库、业务术语等。 Data standards provide specifications and restrictions for data model design, helping ensure metadata and data quality. In this version, data standards include data tiering standards, model design standards (logical entity naming rules, field naming rules, and data types), model library, and service terminologies. 
数据质量 Data quality   通过计划、实施和控制活动,运用质量管理技术,度量、评估、改进和保证数据的恰当使用。数据质量管理的总体目标:标准化、体系化、自动化的全面数据质量管理,以达到数据质量控制的全面性、可控性、可度量性、可迅速定位和有效解决。 Data quality management uses quality management technologies to measure, evaluate, improve, and ensure the proper use of data through activity planning, execution, and control. The overall objective of data quality management is to make data quality control standardized, systematic, automatic, comprehensive, manageable, and measurable, enabling system users to quickly locate and rectify data quality issues. 
数据安全 Data security   数据安全关注数据治理过程中与数据相关的安全保障技术及相应的管理办法,包括:数据权限控制、数据去隐私化、数据加解密、数据访问审计等;保证数据可信、可用。 Data security focuses on security assurance technologies and management methods in the data governance process, including data permission control, data anonymization, data encryption and decryption, and data access audit, ensuring that data is trustable and usable. 
统一调度 Unified Scheduling   提供可视化的任务调度开发和编辑平台,方便用户通过“拖拽式”的方式编辑任务执行逻辑(触发、依赖、汇接、循环等),实现跨集群、跨系统任务的集中调度和管理 Unified Scheduling provides a visualized platform for developing and editing task schedules. Users can edit task execution logic (triggers, dependencies, joins, loops, and so on) by dragging and dropping, and tasks across clusters and systems are scheduled and managed centrally.
预置事件 Preconfigured event   为减少用户配置事件的工作量,CAE出厂时自带了常用网络信令的事件定义,比如Gn口、http、ftp等,用户可直接使用。这些事件称为预置事件。 To reduce the event configuration work load, the CAE preconfigures common network signaling events, for example, Gn interface events, HTTP events, and FTP events. These events are preconfigured events and can be directly used. 
事件管理中心 Event management center   事件管理中心是用来浏览、编辑、新增、删除事件定义等事件管理的前台功能。 The event management center provides event management functions on the GUI, including event browsing, editing, creation, and deletion. 
规则 Rule   每一类行为感知模型称为一个规则,比如“用户驻留在某地**分钟”、“用户当月消费流量达到**MB”、“用户最近x天使用某app消费流量达到y MB”,等等,都称为一个rule。
每个rule在物理上是一个包含main class的jar,加上一个描述文件(元数据)组成。
CAE流处理IDE里用户编排出来的一个流程,也是一个rule。
A type of behavior awareness model is called a rule, for example, User Staying at a Place for XX Minutes, User Traffic Usage Reaching XX MB in the Current Month, and Traffic Usage Reaching Y MB by Using an App in Recent X Days.
Physically, each rule consists of a JAR package that has a main class and a description file (metadata).
A process orchestrated by a user in the stream processing IDE of the CAE is a rule. 
实例 Task   每个rule被赋予不同的参数并提交运行后,CAE内部都用一个task来表示其实例。比如对于“用户驻留在某地**分钟”这个rule,赋予参数“火车站,30”并提交运行后,生成一个task 001;被赋予参数“机场,10”并提交运行后,生成另一个task 002。以此类推。
rule和task的关系类似于Java里的类和对象。
每个task的输入源和输出目标,都是事件中心里的第二类Event,物理上对应Kafka的topic。
After a rule is given a set of parameters and submitted for execution, the CAE represents that instance internally as a task. For example, if the parameters "train station, 30" are passed to the User Staying at a Place for XX Minutes rule and the rule is submitted for execution, task 001 is generated; if the parameters "airport, 10" are passed to the same rule and it is submitted, task 002 is generated, and so on.

The relationship between a rule and a task is similar to that between a Java class and an object.

The input source and output target of each task are the events of the second type in the event management center, which are Kafka topics physically.
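The class/object analogy between a rule and a task can be illustrated with a small sketch. Everything here is hypothetical (the `Rule` and `Task` classes, the `submit` method, and the topic names are not CAE APIs); it only shows one rule producing several parameterized tasks, each bound to an input and an output topic.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple


@dataclass(frozen=True)
class Task:
    """One running instance of a rule (hypothetical model, not a CAE class)."""
    task_id: str
    rule_name: str
    params: Tuple
    input_topic: str   # assumed to name a Kafka topic
    output_topic: str  # assumed to name a Kafka topic


class Rule:
    """A behavior awareness model that can be instantiated many times."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._ids = count(1)

    def submit(self, *params, input_topic: str, output_topic: str) -> Task:
        # Each submission with concrete parameters yields a new task.
        task_id = f"{next(self._ids):03d}"
        return Task(task_id, self.name, params, input_topic, output_topic)


stay = Rule("user_staying_at_place")
t1 = stay.submit("train station", 30, input_topic="signaling", output_topic="stay_alerts")
t2 = stay.submit("airport", 10, input_topic="signaling", output_topic="stay_alerts")
print(t1.task_id, t1.params)
print(t2.task_id, t2.params)
```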
事件 Event   CAE事件管理中心管理2类Event:原始信令/消息和Kafka topic。
其他的Event,如:Flume节点内部各个拦截器之间的Flume event、和Sparkstreaming job内各个Chain函数之间交换的event,不属于事件管理中心管理的Event。
The event management center of the CAE manages two types of events, original signaling/messages and Kafka topics.

Other events, such as the Flume events passed between interceptors inside a Flume node and the events exchanged between Chain functions inside a Spark Streaming job, are not managed by the event management center. 
媒体标签知识库 Media tag knowledge base   聚合互联网影视资源库(元数据+影评/评论),通过系统自动挖掘技术分析形成的标签知识库。
视频媒体标签知识库支持独立服务提供,与视频解决方案无缝集成,支持视频内容精细运营
The media tag knowledge base aggregates Internet movie and video resource libraries (including the metadata and comments on movies) and forms a tag knowledge base through automatic mining and analysis.

The media tag knowledge base is provided as an independent service. It is seamlessly integrated with the video solution and supports refined operation for video content. 
推荐列表 Recommendation list   由算法和推荐内容组成。推荐列表的结果数据由推荐系统预先计算生成,计算过程使用到大数据分析挖掘平台。 A recommendation list consists of a recommendation algorithm and recommended content. The result data of a recommendation list is precomputed by the recommendation system, which uses the big data analysis and mining platform for the calculation. 
A/B测试 A/B testing   同一个页面推荐位,不同用户分流采用不同推荐算法。方便运营过程中,实时对不同推荐模型进行有效评估和改进推荐命中率,让数据来指导产品决策 For the same recommendation slot on a page, users are split into groups and each group is served by a different recommendation algorithm. During operation, this makes it possible to evaluate recommendation models in real time and improve the recommendation hit rate, so that product decisions are guided by data. 
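One common way to split users across algorithms for a slot is deterministic hashing, sketched below. The glossary does not say how the platform assigns users, so this is an assumption; the function name, slot name, and algorithm names are all hypothetical.

```python
import hashlib
from typing import List


def assign_algorithm(user_id: str, slot: str, algorithms: List[str]) -> str:
    """Map a user to one algorithm for a given recommendation slot.

    Hashing (slot, user_id) keeps the assignment stable across requests
    without storing any per-user state.
    """
    digest = hashlib.sha256(f"{slot}:{user_id}".encode("utf-8")).hexdigest()
    return algorithms[int(digest, 16) % len(algorithms)]


algorithms = ["collaborative_filtering", "content_similarity"]
for uid in ["u1001", "u1002", "u1003"]:
    print(uid, assign_algorithm(uid, "home_top", algorithms))
```

Because the assignment depends only on the slot and the user ID, hit rates per algorithm can be compared afterwards by grouping click logs with the same function.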
用户画像偏好推荐 User profile preference-based recommendation   基于用户历史行为和实时行为记录计算出的用户画像偏好标签(包括内容类别偏好、导演偏好、演员偏好),并根据用户画像偏好计算出匹配度TopN的内容进行推荐。 User profile preference tags, including preferences for content genres, actors, and directors, are calculated based on users' historical behavior and real-time behavior. Then top N content is calculated based on the user profile preferences for recommendations. 
内容相似特征推荐 Content feature similarity-based recommendation   基于内容的特征(包括:内容分类、导演、演员、内容标签、归属栏目),计算内容之间的相似度,选择相似度TopN的内容进行推荐。 Content similarities are calculated based on content features, including the genre, director, actor, content tag, and category, and the top N most similar content items are recommended to users. 
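As a rough illustration of feature-based similarity, the sketch below represents each content item as a set of feature strings and ranks other items by Jaccard similarity. The glossary does not name the similarity measure the platform uses, so Jaccard is an assumption, and the catalog data is made up.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of features two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_n_similar(target: str, catalog: Dict[str, Set[str]], n: int) -> List[Tuple[str, float]]:
    """Return the n items most similar to `target`, best first."""
    scores = [
        (other, jaccard(catalog[target], features))
        for other, features in catalog.items()
        if other != target
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))  # ties broken by name
    return scores[:n]


# Hypothetical catalog: features cover genre, director, and actor.
catalog = {
    "A": {"genre:action", "director:x", "actor:y"},
    "B": {"genre:action", "director:x", "actor:z"},
    "C": {"genre:drama", "director:q", "actor:w"},
}
print(top_n_similar("A", catalog, 1))  # [('B', 0.5)]
```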
协同过滤推荐 Collaborative filtering-based recommendation   基于用户-产品矩阵(UPM, User-Product Matrix),根据用户(User)对产品之前的评分(购买、使用),或其他用户对产品的评分(购买、使用)信息,根据最邻近用户使用的产品、或最邻近产品的使用用户,选择相似度TopN的产品/用户进行推荐 Based on the user-product matrix (UPM), which records how a user and other users have previously scored (purchased or used) products, the top N most similar products or users are selected for recommendation according to either the products used by a user's nearest-neighbor users or the users of a product's nearest-neighbor products. 
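The user-based variant of collaborative filtering can be sketched as below: find the nearest-neighbor users in the UPM, then recommend products they used that the target user has not. Cosine similarity and the similarity-weighted scoring are assumptions chosen for illustration, and the matrix contents are invented.

```python
from math import sqrt
from typing import Dict, List

UPM = Dict[str, Dict[str, float]]  # user -> {product: score}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse score vectors."""
    dot = sum(u[p] * v[p] for p in set(u) & set(v))
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(user: str, upm: UPM, k: int = 1, n: int = 2) -> List[str]:
    """Recommend up to n unseen products from the k nearest-neighbor users."""
    neighbours = sorted(
        ((cosine(upm[user], scores), other) for other, scores in upm.items() if other != user),
        key=lambda pair: (-pair[0], pair[1]),
    )[:k]
    candidates: Dict[str, float] = {}
    for similarity, other in neighbours:
        for product, score in upm[other].items():
            if product not in upm[user]:
                candidates[product] = candidates.get(product, 0.0) + similarity * score
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:n]]


upm = {
    "alice": {"p1": 5, "p2": 4},
    "bob": {"p1": 5, "p2": 4, "p3": 5},
    "carol": {"p4": 5},
}
print(recommend("alice", upm))  # ['p3']
```

The item-based variant mentioned in the definition works the same way with the matrix transposed: compare product columns and recommend to the users of the nearest-neighbor products.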
内容自优化 Content self-optimization   根据用户反馈作为样本,优化和调整点击率预测模型,并输出更新后的模型和模型权重参数,并在推荐在线层提供的实时推荐接口计算用户个性化推荐列表时使用,使得推荐结果越来越匹配用户偏好 Based on users' feedback, the click rate prediction model is optimized. The updated model and model weight parameters are generated. This model is used when the real-time recommendation interface at the online layer calculates the personalized recommendation list that better matches users' preferences. 
用户画像标签知识库 User profile tag knowledge base   通过用户的观看行为数据、浏览行为数据、搜索行为数据等数据,构建用户基本资料、用户家庭特征、用户内容偏好、用户业务行为偏好、用户业务价值特征、用户在线身份特征等多个维度的画像标签体系。 The user profile tag knowledge base contains profile tag systems of multiple dimensions including user basic information, user family characteristics, content preference, service behavior preference, service value characteristics, and online identity characteristics based on users' content playback, browsing, and search behavior data.
长期偏好 Long-term preference   以一段时间(1个月)内为1个统计周期的用户行为记录进行统计,作为用户对某内容长期偏好,表示该用户长期稳定的偏好。 A user's long-term preference for specific content is calculated based on the user's behavior within a period (one month). A long-term preference is stable.
短期偏好 Short-term preference   以一段时间(前1天)内为1个统计周期的用户行为记录进行统计,作为用户对某内容短期偏好,表示该用户短期的偏好。 A user's short-term preference for specific content is calculated based on the user's behavior records within a statistical period (the previous day). It represents the user's short-lived preference. 
实时偏好 Real-time preference   根据在线用户短时间窗口内的用户准实时行为记录进行统计,作为用户对某内容实时偏好,表示该用户实时在线偏好特征。 A user's real-time preference is calculated based on the user's quasi-real-time behavior in a short time window. 
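The long-term, short-term, and real-time preferences differ mainly in the length of the statistical window. A minimal sketch, assuming behavior records are simple (timestamp, content category) pairs and that preference strength is a plain count; the 30-day, 1-day, and 10-minute windows mirror the definitions, with the 10-minute value being an arbitrary stand-in for "a short time window":

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Event = Tuple[datetime, str]  # (when the behavior happened, content category)


def preference(events: Iterable[Event], now: datetime, window: timedelta) -> Counter:
    """Count behaviors per category inside the window that ends at `now`."""
    start = now - window
    return Counter(category for ts, category in events if start <= ts < now)


now = datetime(2016, 10, 17, 0, 0)
events = [
    (datetime(2016, 10, 16, 23, 55), "drama"),
    (datetime(2016, 10, 16, 20, 0), "drama"),
    (datetime(2016, 9, 30, 12, 0), "action"),
    (datetime(2016, 9, 25, 12, 0), "action"),
    (datetime(2016, 9, 20, 12, 0), "action"),
    (datetime(2016, 8, 1, 12, 0), "news"),
]

long_term = preference(events, now, timedelta(days=30))
short_term = preference(events, now, timedelta(days=1))
real_time = preference(events, now, timedelta(minutes=10))
print(long_term.most_common(1), short_term.most_common(1), real_time.most_common(1))
```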
资源使用策略 Resource usage policy   定义了用户访问UARP时的资源使用方式。
UARP定义的资源包括:CPU、内存、网络、服务优先级。支持自定义插件。
A resource usage policy defines the resource usage method for a user to access the UARP.

Resources defined in the UARP include the CPU, memory, network, and service priority. User-defined plug-ins are supported. 
认证策略 User authentication policy   定义了用户访问UARP时的身份认证方式。UARP提供的策略包括:黑白名单、用户名/密码。支持自定义插件。 A user authentication policy defines the method for authenticating users who access the UARP. The UARP provides the following policies: blacklist, whitelist, and user name/password. User-defined plug-ins are supported. 
鉴权策略 Permission authentication policy   定义了用户服务调用时的权限验证方式。UARP提供的策略包括:黑白名单、发布/订阅。支持自定义插件。 A permission authentication policy defines the permission verification method used during users' service invoking. The UARP provides the following policies: blacklist, whitelist, and publishing/subscription. User-defined plug-ins are supported. 
服务调用 Service invoking   用户使用UARP提供服务的唯一方法。 Service invoking is the only method for users to use services provided by the UARP. 
单次call Simple call   用户只是简单的call一个单个的服务。跟“作业”相对。 A user simply calls a single service. A simple call is the counterpart of a job. 
作业 Job   UARP支持用户对多个服务进行编排,并制定编排结果的执行策略。编排好的结果通过UARP的作业服务提交。 The UARP allows system users to orchestrate multiple services and create execution policies based on the orchestration result. The orchestration result is submitted through the job service of the UARP. 
任务 Task   UARP作为一个运行时平台(也可以理解成任务平台),为了对用户发起的服务调用进行统一的、有序的、全方面(资源、优先级等)的管理,UARP会对这些服务调用根据用户的资源使用策略进行统一的运行时调度(比如:如何排队、如何分配服务实例等)。任务就是进行运行时调度的单位。一个会话只包含一个任务,任务也不能脱离会话存在。 The UARP is a runtime platform (it can also be understood as a task platform). To manage user-initiated service invocations in a unified, orderly, and comprehensive way (covering resources, priorities, and so on), the UARP performs unified runtime scheduling of these invocations based on users' resource usage policies, for example, how they are queued and how service instances are assigned. A task is the unit of runtime scheduling. A session contains only one task, and a task cannot exist outside a session. 
会话 Session   如果把UARP在垂直方向切割成多个剖面,每个剖面都具备UARP的整个功能。那么剖面从静止到运行再到静止的整个过程就是会话。简单的说会话就是用户发起一次服务调用(或一次作业执行)的整个过程。 If the UARP is cut vertically into multiple slices, each slice has all functions of the UARP. A session is the entire process in which a slice goes from idle to running and back to idle. Simply put, a session is the entire process of one service invocation (or one job execution) initiated by a user. 
服务 Service   服务是实现功能的具体体现,具有统一的开发、运行和调用规范。UARP中的服务都是分布式的、集群化的。每个服务都有一个服务提供者。 Services are implementations of functions and have unified development, running, and invoking standards. Services in the UARP are distributed and clustered. Each service has a service provider. 
资源 Resource   资源包括以下几种:
1、硬件资源:cpu、内存、I/O等。
2、软件资源:服务。
3、其它:优先级。
资源的使用策略定义了如何分配和使用资源。如:可以使用多少硬件资源、分配几个服务实例、对服务使用的优先级等。
Resources are classified into the following types:
Hardware resources: include the CPU, memory, and I/O
Software resources: services
Others: priority
Resource usage policies define how to allocate and use resources, for example, number of hardware resources that can be used, number of service instances to be allocated, and service usage priorities. 
作业执行策略 Job scheduling policy   定义了作业何时被执行。UARP提供的策略包括:立即执行、周期执行。支持自定义插件。 A job scheduling policy defines when a job is executed. In the UARP, jobs can be immediately or periodically executed. User-defined plug-ins are supported. 
接入域 Access subsystem   接入子系统:负责管理接入的协议、连接、和部分鉴权的功能。 The access subsystem manages access protocols, connections, and some authentication functions.  
会话域 Session subsystem   会话子系统:UARP运行时的核心驱动模块,负责管理会话的整个生命周期、会话上下文、以及会话在各个状态下的动作驱动 Core drive module of the UARP, which manages the entire session life cycle, session context, and session action drive in each status. 
管理域 Management subsystem   配置管理子系统:UARP的配置中心,负责UARP内各个域的配置的统一存取、维护和展现 Configuration center of the UARP, which uniformly stores, maintains, and displays the configuration of each subsystem in the UARP. 
运维域 Operation & maintenance subsystem   运维子系统:UARP的运行维护中心,负责对UARP的运行状态提供统一的运维界面、告警接口和采集方式 Operation & maintenance (O&M) center of the UARP, which provides unified O&M pages, alarming interfaces, and ingestion methods for UARP running status monitoring. 
调度域 Scheduling subsystem   调度子系统:UARP的统一调度中心,负责对任务进行统一的调度管理,也应该支持和外部调度系统对接(如:USE),当和外部调度系统对接的时候调度域只充当代理的职责 Unified scheduling center of the UARP, which uniformly schedules and manages tasks. This subsystem can connect to an external scheduling system, for example, the USE. In this case, this subsystem functions as a proxy. 
任务域 Task subsystem   任务子系统:任务被调度域调度的单位,任务信息包括了用户、资源、描述等信息。所有对UARP的访问请求都会被转化成任务执行。任务域负责对任务及任务实例的统一管理。 A task is the unit scheduled by the scheduling subsystem. Task information includes user, resource, and description information. All access requests to the UARP are converted to tasks for execution. The task subsystem uniformly manages tasks and task instances. 
服务域 Service subsystem   服务子系统:UARP内所有的能力(对外的、对内的)均已服务的形式落地,服务域负责对所有的服务进行治理,并提供统一的服务开发、运行及访问形式。 All internal and external capabilities of the UARP are implemented as services. The service subsystem governs all services and provides unified service development, running, and access forms. 
资源域 Resource subsystem   资源子系统:UARP是支持多用户的运行平台,资源域提供了对用户的请求进行资源分配的能力。 The UARP is a runtime platform that supports multiple users. The resource subsystem provides the capability of allocating resources to user requests. 
集成域 Integration subsystem   集成子系统:UARP是个运行平台,会存在不同的计算和存储框架,集成域提供统一的技术对不同的计算和存储框架进行屏蔽,使服务开发人员不用关注具体的底层框架细节。 The UARP contains multiple computing and storage frameworks. The integration subsystem provides a unified technology to shield differences between these computing and storage frameworks and therefore service development personnel do not need to pay attention to details about bottom frameworks. 
消息处理部件 Message processing component   UARP中的通用技术部件,为UARP中的各个子系统之间的消息传递、存储、接收、和处理提供了统一的实现。 Common technical component in the UARP, which provides unified implementation of message sending, receiving, storage, and processing between subsystems in the UARP. 
事件处理部件 Event processing component   UARP中的通用技术部件,UARP是基于事件驱动的架构,事件处理部件为UARP中的各个子系统中对事件的发送、中转、接收和处理提供了统一的实现。 Common technical component in the UARP. The UARP is based on an event-driven architecture, and this component provides a unified implementation of event sending, relaying, receiving, and processing for the subsystems in the UARP. 
缓存管理部件 Cache management component   UARP中的通用技术部件,UARP中的很多子系统都会使用缓存来保存状态和其它数据,缓存处理部件为UARP中的各个子系统中对缓存的创建、读写、生命周期管理提供了统一的实现。 Common technical component in the UARP. Many subsystems of the UARP use caches to store status and other data. This component provides a unified implementation of cache creation, reading and writing, and life cycle management for the subsystems in the UARP. 