Field Formats
RAVENPACK FIELD VALUES
Observations are based on a sample from the initial 10 days, US Equities only, Normal trading hours (sample used contained 250-450k events) and are provided purely to give an indication of the type of values you can expect to see.
On average, just over 100k events per day were observed.
On average, around 45k events were observed during trading hours.
Hourly totals peaked at 8k events per hour during 08:00 - 10:00 and 16:00 - 17:00 EST.
NOTE: Blank entries are returned as None. In Python 2, None<0 evaluates to True (Python 3 raises a TypeError instead), so if you are checking for negatives, be sure to check for None as well.
For example, if you were checking EVENT_SENTIMENT_SCORE like this...
if field['EVENT_SENTIMENT_SCORE']<0:
Then you will get back all the negatives AND all the blanks (Nones).
Instead use...
if field['EVENT_SENTIMENT_SCORE'] is not None and field['EVENT_SENTIMENT_SCORE']<0:
This will return only the negatives.
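The None check above can be wrapped in a small helper so it is not repeated for every score field. This is a minimal sketch: the `is_negative` name and the sample `events` list are hypothetical, and only the EVENT_SENTIMENT_SCORE field name comes from this document.

```python
def is_negative(value):
    """Return True only for real negative numbers; a blank (None) is never negative."""
    # Test for None first so the comparison is never evaluated against a blank.
    return value is not None and value < 0


# Hypothetical sample records, for illustration only.
events = [
    {"EVENT_SENTIMENT_SCORE": -0.42},
    {"EVENT_SENTIMENT_SCORE": None},
    {"EVENT_SENTIMENT_SCORE": 0.15},
]

negatives = [e for e in events if is_negative(e["EVENT_SENTIMENT_SCORE"])]
```

Because the None test comes first, the helper behaves the same whether a bare `None < 0` would return True or raise an error.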
TIMESTAMP_UTC
The Date/Time at which the news item was received by RavenPack
UTC YYYY-MM-DD hh:mm:ss.sss
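If the timestamp arrives as text in the format above, it can be turned into a timezone-aware value with the standard library. This is a sketch assuming Python 3 and a string input; `parse_timestamp_utc` is a hypothetical helper name, and the field may already be delivered as a datetime object in your environment.

```python
from datetime import datetime, timezone


def parse_timestamp_utc(text):
    """Parse 'YYYY-MM-DD hh:mm:ss.sss' into a timezone-aware UTC datetime."""
    # %f accepts one to six fractional digits, so millisecond values parse cleanly.
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return naive.replace(tzinfo=timezone.utc)


ts = parse_timestamp_utc("2016-09-01 13:00:00.000")
```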
HEADLINE
The headline or summary text for the document.
Maximum of 4000 characters.
EVENT_TEXT
A short text summary of an event captured in a document.
Maximum of 400 characters.
NIP – NEWS IMPACT PROJECTIONS
-1.00 to +1.00 (2 decimal places) based on degree of impact a news flash has on the market over the following two-hour period.
Scores below 0 indicate low or unknown impact and lower confidence in the score.
The best performance of the score is obtained when filtering for relevance above 0.90.
All events contain a value, majority are negative.
88% are < 0 , 2% are = 0 , 11% are > 0
In the 10 day sample period with over 300k events only 1 scored above 0.90.
NIP | Sample Pct |
---|---|
-0.40 to -1.00 | 6.45% |
-0.34 to -0.38 | 6.14% |
-0.30 to -0.32 | 6.36% |
-0.26 to -0.28 | 7.16% |
-0.24 | 6.01% |
-0.22 | 6.38% |
-0.20 | 6.34% |
-0.18 | 13.54% |
-0.16 | 4.13% |
-0.14 | 4.40% |
-0.12 | 6.02% |
-0.10 | 7.54% |
-0.06 to 0.00 | 7.31% |
0.00 to 0.10 | 6.68% |
0.10 to 1.00 | 5.54% |
RELEVANCE
0 to 100 (integer) how strongly related the mention of an entity is to the underlying news story, with higher values indicating greater relevance.
A score of 0 means the entity was passively mentioned.
A score of 100 means the entity was prominent in the news story.
Values above 75 are considered significantly relevant.
All events contain a value.
30% are above 75, 69% below 75, 1% are zero.
RELEVANCE | Sample Pct |
---|---|
0 | 1% |
1 | 5% |
2 | 7% |
3 | 8% |
4 | 7% |
5 | 4% |
6-9 | 7% |
10-19 | 8% |
20-29 | 4% |
30-39 | 10% |
40-49 | 5% |
50-59 | 3% |
60-69 | 1% |
70-79 | 1% |
80-89 | 2% |
90-98 | 8% |
99 | 11% |
100 | 10% |
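Since values above 75 are considered significantly relevant, a simple filter on that threshold is a common first step. This is a minimal sketch: the `significantly_relevant` helper and the sample records are hypothetical, and events are assumed to be dictionaries keyed by field name.

```python
SIGNIFICANT_RELEVANCE = 75  # threshold quoted in the RELEVANCE description


def significantly_relevant(events, threshold=SIGNIFICANT_RELEVANCE):
    """Keep only events whose RELEVANCE is strictly above the threshold."""
    return [e for e in events if e["RELEVANCE"] > threshold]


# Hypothetical sample records, for illustration only.
sample = [{"RELEVANCE": 100}, {"RELEVANCE": 75}, {"RELEVANCE": 3}, {"RELEVANCE": 99}]
kept = significantly_relevant(sample)
```

Going by the sample distribution, a filter like this would keep roughly 30% of events.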
CSS – COMPOSITE SENTIMENT SCORE
-1.00 to +1.00 (2 decimal places) based on combining other RavenPack sentiment analysis techniques.
All events contain a value, 45% are zero, removing them we have...
81% > 0, 19% < 0.
EVENT_SENTIMENT_SCORE
-1.00 to +1.00 (2 decimal places) based on various proxies sampled from the news.
From the sample around 25% of events contained a score. Of those...
53% > 0, 28% = 0, 19% < 0.
EVENT_RELEVANCE
0 to 100 reflects the relevance of the event in the story per table below.
From the sample around 25% of events contained a score. Of those...
Position | Event_Relevance | Sample Pct |
---|---|---|
First in Headline | 100 | 21.3% |
Subsequent in Headline | 90-99 | 0.6% |
Paragraphs 1 & 2 | 80-89 | 24.0% |
Rest of story body | 1-79 | 52.4% |
Zero | 0 | 1.7% |
Observations : the 24% in 80-89 is almost all at 80 (23.8%).
The formula seems to cluster scores at 5 and 10 marks.
This field is very good for ensuring the news article relates directly to the symbol.
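The position table for EVENT_RELEVANCE can be expressed as a lookup, which is handy when you want to filter on where in the story an event appeared. This is a sketch only: `event_relevance_position` is a hypothetical helper, and blank scores are assumed to arrive as None.

```python
def event_relevance_position(score):
    """Map an EVENT_RELEVANCE score to the position label from the table."""
    if score is None:
        return None  # blank entry: no event was scored
    if score == 100:
        return "First in Headline"
    if 90 <= score <= 99:
        return "Subsequent in Headline"
    if 80 <= score <= 89:
        return "Paragraphs 1 & 2"
    if 1 <= score <= 79:
        return "Rest of story body"
    if score == 0:
        return "Zero"
    raise ValueError("EVENT_RELEVANCE out of range: %r" % (score,))


label = event_relevance_position(80)
```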
PEQ – GLOBAL EQUITIES
Positive and Negative words and phrases in articles about global equities.
-1, 0, or 1 (negative, neutral, positive sentiment).
All events contain a value, 75% are zero, removing zero events we have..
82% > 0, 18% < 0
BEE – EARNINGS EVALUATIONS
News stories concerning earnings evaluations.
-1, 0, or +1 (negative, neutral, positive sentiment).
All events contain a value, 88% are zero, removing zero events we have..
74% > 0, 26% < 0
BMQ – EDITORIALS & COMMENTARY
News sentiment based on short commentary and editorials on global equity markets.
-1, 0, or +1 (negative, neutral, positive sentiment).
All events contain a value, 62% are zero, removing zero events we have..
79% > 0, 21% < 0
BAM – MERGERS & ACQUISITIONS
News stories about mergers, acquisitions, and takeovers.
-1, 0, or +1 (negative, neutral, positive sentiment).
Trained on stories that lead up to pre-identified mergers, acquisitions, and takeover events.
All events contain a value, 97% are zero, removing zero events we have..
69% > 0, 31% < 0
BCA – REPORTS ON CORPORATE ACTIONS
Corporate action announcements.
-1, 0, or +1 (negative, neutral, positive sentiment).
All events contain a value, 84% are zero, removing zero events we have..
76% > 0, 24% < 0
BER – EARNINGS RELEASES
-1, 0, or +1 (negative, neutral, positive sentiment)
All events contain a value, 83% are zero, removing zero events we have..
75% > 0, 25% < 0
MCQ – MULTI CLASSIFIER FOR EQUITIES
A score that represents the news sentiment based on the tone; applicable only towards the most relevant entities mentioned in a story.
MCQ scores can take values of -1, 0, or 1 indicating negative, neutral, or positive sentiment, respectively.
The score is derived from a combination of analytics values produced by the BMQ, BEE, BCA, and ANL-CHG classifiers.
An MCQ score can have a non-neutral value when the relevance score for an entity is 90 or higher and either there is an ANL-CHG score or all of BMQ, BEE, and BCA scores are positive (1) and neutral (0) or negative (-1) and neutral (0).
Entities with a relevance of less than 90 will always have a neutral score.
The logic behind this analytic is to detect consistent sentiment classifications, discarding combinations where these classifiers may have contradictory scores.
All events contain a value, 86% are zero, removing zero we have..
75% >0, 25% <0
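The MCQ rule described above can be read as a small decision function. The sketch below is one interpretation of that description, not RavenPack's implementation. It assumes that "there is an ANL-CHG score" means a non-zero ANL_CHG value and that such a value decides the sign. The function name `mcq_sketch` and its parameters are hypothetical.

```python
def mcq_sketch(relevance, bmq, bee, bca, anl_chg):
    """One possible reading of the MCQ rule; all inputs other than relevance are -1, 0, or 1."""
    # Entities with a relevance below 90 are always neutral.
    if relevance < 90:
        return 0
    # Assumption: a non-zero analyst change score decides the result on its own.
    if anl_chg != 0:
        return anl_chg
    scores = (bmq, bee, bca)
    # Positive and neutral only: consistent positive sentiment.
    if any(s > 0 for s in scores) and all(s >= 0 for s in scores):
        return 1
    # Negative and neutral only: consistent negative sentiment.
    if any(s < 0 for s in scores) and all(s <= 0 for s in scores):
        return -1
    # Contradictory or all-neutral classifiers are discarded as neutral.
    return 0
```

The point of the sketch is the consistency check: a mix of +1 and -1 among BMQ, BEE, and BCA yields a neutral result.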
FACT_LEVEL
An indicator of how close the event relates to a fact.
Maximum of 10 characters.
During sample 68% of events had no value. Breakdown of events with an entry:
Fact-Level | Sample Pct | Description |
---|---|---|
fact | 95% | An event that came from a concrete statement of information. |
forecast | 4% | An event that gives guidance about the future. |
opinion | 1% | An event that expresses a view or a hypothesis about a statement of information. |
GROUP
A collection of related events. The second highest level of the RavenPack Event Taxonomy.
Around 25% of events contain a group. Examples below..
Top 10 make up just over 90% of events with a value.
Group | Sample Pct |
---|---|
stock-prices | 17% |
equity-actions | 14% |
earnings | 13% |
analyst-ratings | 11% |
acquisitions-mergers | 10% |
products-services | 9% |
revenues | 5% |
labor-issues | 4% |
price-targets | 4% |
partnerships | 4% |
legal | 2% |
Other group names include... 'investor-relations', 'marketing', 'assets', 'dividends', 'insider-trading', 'credit-ratings', 'credit', 'stock-picks', 'regulatory', 'security', 'corporate-responsibility', 'crime', 'government', 'bankruptcy', 'war-conflict', 'indexes', 'domestic-product', 'industrial-accidents', 'technical-analysis', 'transportation', 'balance-of-payments', 'exploration', 'employment', 'public-opinion', 'civil-unrest', 'taxes'
Maximum of 50 characters.
TYPE
A class of events, the constituents of which share similar characteristics.
Again, around 25% of events contain a type.
Top 10 constitute around 67% of the events with a type.
TYPE | Sample Pct |
---|---|
stock-price | 17% |
ownership | 11% |
earnings | 7% |
analyst-ratings-change | 7% |
stake | 6% |
revenue | 4% |
analyst-ratings-set | 4% |
price-target | 4% |
partnership | 4% |
acquisition | 3% |
Other type names include... 'product-release', 'earnings-per-share-estimate', 'business-contract', 'executive-appointment', 'conference-call', 'conference', 'earnings-per-share', 'product-enhancement', 'executive-resignation', 'dividend', 'clinical-trials', 'legal-issues', 'unit-acquisition', 'facility', 'insider-sell', 'settlement', 'earnings-guidance', 'board-member-appointment', 'revenue-guidance', 'award', 'credit-rating-change', 'acquisition-bid', 'earnings-per-share-guidance', 'going-private', 'earnings-expectations', 'earnings-per-share-expectations', 'workforce-salary', 'product-development', 'investment', 'merger', 'executive-firing', 'product-pricing', 'public-offering', 'revenue-volume', 'stock-pick', 'buybacks', 'trading', 'sanctions', 'hirings', 'operating-earnings', 'expenses', 'earnings-estimate', 'product-review', 'note-sale', 'board-member-resignation', 'demand', 'supply'
Maximum of 50 characters.
SUB_TYPE
A subdivision of a particular class of events.
14% of events have a value, the top 10 from those were...
SUB-TYPE | Sample Pct |
---|---|
gain | 19% |
loss | 14% |
positive | 12% |
increase | 9% |
negative | 8% |
up | 7% |
neutral | 6% |
decrease | 5% |
upgrade | 5% |
downgrade | 2% |
Other sub-types observed in sample: 'above-expectations', 'down', 'set', 'completed', 'open', 'below-expectations', 'interest', 'approved', 'start', 'granted', 'close', 'buy', 'terminated', 'pricing', 'sale', 'affirmation', 'unchanged', 'complete', 'release', 'cut', 'approval', 'raise', 'halt', 'resumed', 'awarded', 'rumor', 'equity', 'listing', 'considered', 'dismissed', 'filed', 'stable', 'filing', 'failed', 'relocation', 'fears', 'sell', 'reduction', 'retired', 'charge', 'meet-expectations', 'discontinued', 'attack', 'scrutiny', 'suspended', 'confirmation', 'rejected', 'costs', 'withdrawn', 'mixed', 'pass', 'removed', 'extended', 'provisional-rating', 'revision', 'lifted', 'developing', 'no-bonus', 'shares-options', 'delisting', 'conditional'
Maximum of 50 characters.
EARNINGS_TYPE
For events about earnings (Net Income, EPS, EBITDA, etc.), this named attribute indicates the type of calculation method used for reporting.
0.17% of events have a value in this field.
Breakdown as follows:
Type | Sample Pct | Description |
---|---|---|
ex-exceptionals | 50% | Fully adjusted (non-GAAP) to exclude both extraordinary items and SOE. |
reported | 12% | Represents GAAP EPS for all U.S. companies. Includes exceptionals, nonrecurring items, and stock option expense (SOE). Since analysts often provide both adjusted and non-adjusted EPS figures, this distinguishes itself as the non-adjusted figure. |
adjusted | 11% | Represents non-GAAP EPS for all U.S. companies. Excludes exceptionals, nonrecurring items, and SOE. |
diluted-reported | 10% | Represents GAAP earnings "as reported", calculated by the analysts in accordance with the accounting standards by which the company abides. Based on diluted shares and includes all extraordinary and unusual items. |
non-gaap | 6% | Not according to GAAP. These values do not include non-recurring items and include SOE if it has been reported by the company. |
diluted-adjusted | 5% | Labelled in brokers’ research report as being adjusted for any nonrecurring, discontinued operations, and/or exceptional items. Based on the diluted shares. |
non-diluted | 3% | Accounts for all the P&L from operational, trading, and interest activities, that have been discontinued or acquired at any point during the year. Excludes any profit or loss associated with the sale or termination of discontinued operations, fixed assets or related businesses, or from any permanent devaluation or write off of their values. Does not factor in the dilutive effects on convertible securities. |
consolidated | 3% | Data for a given company is merged with all of its affiliates and is consolidated. |
headline-basic | 0% | Accounts for all the P&L from operational, trading, and interest activities that have been discontinued or acquired at any point during the year. Excludes any profit or loss associated with the sale or termination of discontinued operations, fixed assets or related businesses, or from any permanent devaluation or write-off of their values. Does not factor in the dilutive effects on convertible securities. |
headline-diluted | 0% | Accounts for all the P&L from operational, trading, and interest activities that have been discontinued or acquired at any point during the year. Excludes any profit or loss associated with the sale or termination of discontinued operations, fixed assets or related businesses, or from any permanent devaluation or write-off of their values. Calculated using fully diluted shares outstanding. |
standalone | 0% | Represents the data for the parent company. Also known as Non-Consolidated. |
Maximum of 50 characters.
EVALUATION_METHOD
A period of time used to measure changes from previous levels in an event.
Maximum of 50 characters.
Currently RavenPack supports the following evaluation methods as named attributes:
Method | Description |
---|---|
YOY | Year-on-Year change |
QOQ | Quarter-on-Quarter change |
MOM | Month-on-Month change |
RP_STORY_ID
An alphanumeric character identifier to uniquely identify each news story analyzed.
This value is unique across all records.
Example: 1FB2B3F5E99C4D3BCF59FDB3E8C8C9BD.
32 characters
RP_ENTITY_ID
A unique and permanent entity identifier assigned by RavenPack.
Every entity tracked is assigned a unique identifier comprised of 6 alphanumeric characters.
Example: 228D42.
6 characters
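The fixed lengths of RP_STORY_ID and RP_ENTITY_ID make them easy to sanity-check before use as keys. This is a sketch assuming the identifiers arrive as strings containing only ASCII letters and digits, which is an interpretation of "alphanumeric". The helper names are hypothetical.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
STORY_ID_RE = re.compile(r"[0-9A-Za-z]{32}")
ENTITY_ID_RE = re.compile(r"[0-9A-Za-z]{6}")


def is_valid_story_id(value):
    """True when value looks like a 32-character RP_STORY_ID."""
    return isinstance(value, str) and STORY_ID_RE.fullmatch(value) is not None


def is_valid_entity_id(value):
    """True when value looks like a 6-character RP_ENTITY_ID."""
    return isinstance(value, str) and ENTITY_ID_RE.fullmatch(value) is not None
```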
ENTITY_TYPE
The type of entity associated with a particular RP_ENTITY_ID.
As data relates to US Equities on CQ, this value is always COMP.
ENTITY_NAME
The official canonical name of the entity identified by the RP_ENTITY_ID.
Maximum of 400 characters.
COUNTRY_CODE
The two character ISO-3166 country code associated with an entity.
Companies and organizations are associated with the country of incorporation.
2 characters.
All observed entries contain US as CQ trades US Equities only.
EVENT_SIMILARITY_KEY
A unique 32 character key that identifies similar stories in the RavenPack Analytics data.
All similar stories across the entire archive and those arriving on the real-time feed share the same similarity key.
Maximum of 32 characters.
EVENT_SIMILARITY_DAYS
A granular number with up to 5 decimal places which indicates the number of days since a similar event was detected over the last 365 days.
Values range between 0.00000 and 365 inclusive.
A value of 365 means that the most recent similar story may have occurred 365 or more days in the past.
The value 0.00000 means a similar story occurred with the exact same timestamp.
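EVENT_SIMILARITY_DAYS can be used as a novelty filter, keeping only events that have not been preceded by a similar story within some window. This is a sketch under stated assumptions: the `is_novel` helper and the one-day default window are hypothetical choices, and blank values are assumed to arrive as None and are treated as not novel.

```python
def is_novel(event, min_days=1.0):
    """True when no similar event was detected within the last min_days days."""
    days = event.get("EVENT_SIMILARITY_DAYS")
    # A blank value carries no novelty information, so it is excluded here.
    return days is not None and days >= min_days


# Hypothetical sample records, for illustration only.
sample = [
    {"EVENT_SIMILARITY_DAYS": 365},
    {"EVENT_SIMILARITY_DAYS": 0.0},
    {"EVENT_SIMILARITY_DAYS": None},
]
novel = [e for e in sample if is_novel(e)]
```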
TOPIC
A subject or theme of events detected by RavenPack.
The highest level of the RavenPack Event Taxonomy.
Maximum of 50 characters.
24% of events contained an entry
97.7% 'business', 2.2% 'society', 0.06% 'politics' and 0.03% 'economy'.
PROPERTY
A generic named attribute of an event such as an entity, role, or string extracted from a matched event type.
When applicable, the role played by the entity in the story is detected and tagged.
Maximum of 50 characters.
93% of sample had no entry. With those removed...
PROPERTY | Sample Pct |
---|---|
held | 33% |
acquiree | 21% |
rater | 18% |
acquirer | 13% |
owner | 5% |
defendant | 3% |
participant | 2% |
investor | 1% |
supplier | 1% |
organizer | 1% |
Other Properties observed in sample : 'target', 'plaintiff', 'customer', 'summoned', 'recipient', 'disfavored', 'victim', 'oversold', 'favored', 'exporter', 'issuer', 'protester', 'competitor', 'protestee', 'authority', 'importer'
RP_POSITION_ID
A unique and permanent identifier for positions assigned by RavenPack. Every position tracked is assigned a unique entity identifier comprised of 6 alphanumeric characters. A full list of RP_POSITION_IDs is available via the Entity Mapping API.
POSITION_NAME
The position held by an individual within the entity involved in a specific news event. A full list of POSITION_NAMEs is available via the Entity Mapping API. Maximum of 400 characters.
MATURITY
For events related to debt, this named attribute indicates the period of time for which a financial instrument remains outstanding. Maximum of 50 characters.
The time period is represented by the following formats:
Format | Description |
---|---|
1-365-DAY | Maturity in days. Prefix is a number from 1 to 365, e.g. 2-DAY. |
1-52-WK | Maturity in weeks. Prefix is a number from 1 to 52, e.g. 4-WK. |
1-12-MTH | Maturity in months. Prefix is a number from 1 to 12, e.g. 9-MTH. |
1-50-YR | Maturity in years. Prefix is a number from 1 to 50, e.g. 40-YR. |
Only 2 entries in sample of 250k had a value.
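The MATURITY formats in the table follow a number-then-unit pattern, so they can be parsed with a single regular expression. This is a minimal sketch: `parse_maturity` is a hypothetical helper, and the range limits are taken from the table above.

```python
import re

MATURITY_RE = re.compile(r"(\d+)-(DAY|WK|MTH|YR)")
# Upper bounds per unit, as listed in the MATURITY table.
MAX_BY_UNIT = {"DAY": 365, "WK": 52, "MTH": 12, "YR": 50}


def parse_maturity(text):
    """Split a MATURITY value such as '9-MTH' into (count, unit)."""
    match = MATURITY_RE.fullmatch(text)
    if match is None:
        raise ValueError("unrecognised MATURITY value: %r" % (text,))
    count, unit = int(match.group(1)), match.group(2)
    if not 1 <= count <= MAX_BY_UNIT[unit]:
        raise ValueError("MATURITY count out of range: %r" % (text,))
    return count, unit


parsed = parse_maturity("9-MTH")
```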
EVENT_START_DATE_UTC
The date when the event starts in UTC. For example, in a story received at 2016-09-01 13:00:00 that says: “Microsoft will host a conference call on September 2nd”, the TIMESTAMP_UTC will be 2016-09-01 13:00:00.000 and the EVENT_START_DATE_UTC will be 2016-09-02 00:00:00.
EVENT_END_DATE_UTC
The date when the event ends in UTC. For example, in a story received at 2016-09-01 13:00:00 that says: “Microsoft will host a conference call on September 2nd”, the TIMESTAMP_UTC will be 2016-09-01 13:00:00.000 and the EVENT_END_DATE_UTC will be 2016-09-03 00:00:00.
REPORTING_PERIOD
A period on a financial calendar that acts as a basis for reporting business information.
Maximum of 50 characters.
Format | Description |
---|---|
YYYY-Q{1-4} | Quarter-long regular calendar period, e.g. 2016-Q1 |
FY-YYYY-Q{1-4} | Quarter-long fiscal calendar period, e.g. FY-2016-Q1 |
YYYY-H{1-2} | Half-year regular calendar period, e.g. 2016-H1 |
FY-YYYY-H{1-2} | Half-year fiscal calendar period, e.g. FY-2016-H1 |
YYYY-9MTH | 9-Month regular calendar period, e.g. 2016-9MTH |
FY-YYYY-9MTH | 9-Month fiscal calendar period, e.g. FY-2016-9MTH |
YYYY | Year-long regular calendar period, e.g. 2016 |
FY-YYYY | Year-long corporate fiscal calendar period, e.g. FY-2016 |
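The REPORTING_PERIOD formats share one shape: an optional FY- prefix, a four-digit year, and an optional period suffix. The sketch below splits a value into those parts. `parse_reporting_period` and the returned dictionary keys are hypothetical names chosen for illustration.

```python
import re

REPORTING_PERIOD_RE = re.compile(
    r"(?P<fiscal>FY-)?(?P<year>\d{4})(?:-(?P<part>Q[1-4]|H[12]|9MTH))?"
)


def parse_reporting_period(text):
    """Split a REPORTING_PERIOD value into fiscal flag, year, and period part."""
    match = REPORTING_PERIOD_RE.fullmatch(text)
    if match is None:
        raise ValueError("unrecognised REPORTING_PERIOD value: %r" % (text,))
    return {
        "fiscal": match.group("fiscal") is not None,
        "year": int(match.group("year")),
        "part": match.group("part"),  # None for a year-long period
    }


period = parse_reporting_period("FY-2016-Q1")
```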
REPORTING_START_DATE_UTC
The start of the reporting period associated with the event. For example, in a story received at 2016-08-01 13:00:00 that says: “Employment Advances in Q2”, the TIMESTAMP_UTC will be 2016-08-01 13:00:00.000 and the REPORTING_START_DATE_UTC will be 2016-04-01 00:00:00.
REPORTING_END_DATE_UTC
The end of the reporting period associated with the event. For example, in a story received at 2016-08-01 13:00:00 that says: “Employment Advances in Q2”, the TIMESTAMP_UTC will be 2016-08-01 13:00:00.000 and the REPORTING_END_DATE_UTC will be 2016-07-01 00:00:00.
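In the Q2 example above, the start date is the first day of the quarter and the end date is the first day of the following quarter. The sketch below reproduces that pattern for regular calendar quarters only. It assumes the end date is always the day after the period closes, which is inferred from the single example here. `calendar_quarter_bounds` is a hypothetical helper, and fiscal periods are not handled because they depend on each company's calendar.

```python
from datetime import datetime


def calendar_quarter_bounds(year, quarter):
    """Return (start, end) for a calendar quarter, with end on the first day of the next quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1-4")
    start_month = 3 * (quarter - 1) + 1
    start = datetime(year, start_month, 1)
    # Q4 rolls over into January of the following year.
    if quarter == 4:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, start_month + 3, 1)
    return start, end


start, end = calendar_quarter_bounds(2016, 2)
```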
RELATED_ENTITY
An entity that relates directly to another entity within the context of the same event in a story. Within the story, the EVENT_SIMILARITY_KEY will be the same for both records. See the RELATIONSHIP field for an indicator of the type of relationship. Maximum of 6 characters.
RELATIONSHIP
The type of relationship between entities reported in the RELATED_ENTITY field. Maximum of 50 characters. Values are:
Relationship | Description |
---|---|
PRODUCT | A product that is owned by a company. |
OWNER | The company that owns a product. |
CATEGORY
A unique tag to label, identify, and recognize a particular type and property of an entity-specific news event.
Maximum of 100 characters
93% of sample events had no entry
With those removed the top 10 returns which make up 67% of the remaining were...
CATEGORY | Sample Pct |
---|---|
stock-gain | 13% |
earnings | 10% |
stock-loss | 10% |
stake-acquiree | 7% |
ownership-held | 7% |
ownership-increase-held | 5% |
partnership | 5% |
analyst-ratings-change-negative | 4% |
ownership-decrease-held | 4% |
product-release | 3% |
There were 484 other categories observed in the sample.
NEWS_TYPE
Classifies the type of news story into one of the following categories (pct shows as seen in sample):
Type | Description | Sample Pct |
---|---|---|
FULL-ARTICLE | Headline and one or more paragraphs of mostly textual material. | 77.7% |
PRESS-RELEASE | Corporate announcement created by an entity and distributed by a news wire. | 17.0% |
NEWS-FLASH | Headline and no body text. | 2.7% |
TABULAR-MATERIAL | Headline and one or more segments of mostly tabular data. | 2.5% |
HOT-NEWS-FLASH | Headline and no body text, marked as breaking news in the editorial process. | 0.04% |
RNS-SEC8K | A news article that came from an SEC 8K filing. | 0.03% |
RNS-SEC10K | A news article that came from an SEC 10K filing. | 0.0% |
RNS-SEC10Q | A news article that came from an SEC 10Q filing. | 0.0% |
RNS-SEC13D | A news article that came from an SEC 13D filing. | 0.0004% |
RNS-SEC13F | A news article that came from an SEC 13F filing. | 0.0% |
RNS-SEC144 | A news article that came from an SEC 144 filing. | 0.0% |
Maximum of 50 characters.
PROVIDER_ID, RP_SOURCE_ID and SOURCE_NAME appear very similar, though the 'DJ' provider ID contains a few different source IDs and names.
PROVIDER_ID
Identifies the provider of the content.
Maximum of 50 characters.
Every event had a provider id broken down as follows:
PROVIDER_ID | Description | Sample Pct |
---|---|---|
MRVR | MoreOver News and Social Media | 81.2% |
DJ | Dow Jones Newswires and Third Party Content Wires | 11.7% |
BZG | Benzinga Pro | 5.4% |
FLY | The Fly | 1.0% |
MT | Midnight Trader News and Midnight Trader MtPro | 0.5% |
AN | Alliance News | 0.1% |
RP | RavenPack-originated content | 0.1% |
FXS | FX Street News and FX Street Economic Calendar | 0.01% |
SOURCE_NAME
The official canonical name for the news source.
Maximum of 400 characters. 85% of events had a source name.
Approx 1600 unique Source names in the sample.
Top 10 made up 44% of the sample.
SOURCE_NAME | Sample Pct |
---|---|
Bibey Post | 15% |
Dow Jones Newswires | 6% |
Benzinga | 5% |
Ticker Report | 4% |
Nasdaq | 3% |
OpenPR | 3% |
Business Insider | 2% |
Military Technologies News | 2% |
MSN | 2% |
Morningstar | 2% |
RP_SOURCE_ID
A unique and permanent news source identifier assigned by RavenPack.
Every news provider tracked is assigned a unique identifier
6 alphanumeric characters.
Every event had an RP_SOURCE_ID, around 1600 unique source ids.
ANL_CHG – ANALYST RECOMMENDATIONS & CHANGES
A score that represents a change in recommendation by an analyst firm in the form of a numerical score.
When the mention of a company in a story matches the criteria for ANL-CHG, scores can take values of -1, 0, or +1, indicating a downgrade, neutral, or upgrade rating.
99% of events are zero. With the zeroes removed...
63% are -1, 37% are 1
RP_STORY_EVENT_INDEX
Represents the order in which entity records are presented by RavenPack per news story. This integer can be equal to or less than the RP_STORY_EVENT_COUNT.
RP_STORY_EVENT_COUNT
Represents the total entity records published by RavenPack per news story.
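RP_STORY_EVENT_INDEX and RP_STORY_EVENT_COUNT make it possible to reassemble the records of one story in the order RavenPack presented them. This is a sketch assuming records are dictionaries keyed by field name. `group_by_story` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_story(records):
    """Group records by RP_STORY_ID and order each group by RP_STORY_EVENT_INDEX."""
    stories = defaultdict(list)
    for record in records:
        stories[record["RP_STORY_ID"]].append(record)
    for story_records in stories.values():
        story_records.sort(key=lambda r: r["RP_STORY_EVENT_INDEX"])
    return dict(stories)


# Hypothetical sample records with shortened story ids, for illustration only.
sample = [
    {"RP_STORY_ID": "A", "RP_STORY_EVENT_INDEX": 2, "RP_STORY_EVENT_COUNT": 2},
    {"RP_STORY_ID": "B", "RP_STORY_EVENT_INDEX": 1, "RP_STORY_EVENT_COUNT": 1},
    {"RP_STORY_ID": "A", "RP_STORY_EVENT_INDEX": 1, "RP_STORY_EVENT_COUNT": 2},
]
grouped = group_by_story(sample)
```

Comparing the length of each group with its RP_STORY_EVENT_COUNT is one way to spot stories for which not all records were received.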
PROVIDER_STORY_ID
An alphanumeric character identifier to uniquely identify the news story in the provider’s universe.
Maximum of 400 characters.
Here is a key to help interpret these identifiers:
Provider | Format | Example |
---|---|---|
AN | | 1420095499035228600 |
BZG | + ":" + | 123123:1232323 |
DJ | Product + Docdate + Seq | DN20010228008585 |
FLY | | 2122561 |
FXS News | NEWS + ":" + FXS_storyId | NEWS:bde3b58a-88d2-4a03-8c4e-98d4e9da28a |
FXS EcoCal | ECO + ":" + idEcoCalendarDate | ECO:21d28292-d250-4b94-b5ad-e88cfb02ea62 |
MRVR | FeedVersion + ":" + ArticleId | 10:16481270273 |
MT | TransmissionID | A754249 |
MT Pro | Filename | form_01152014_2377436.xml |
RP | | 1FB2B3F5E99C4D3BCF59FDB3E8C8C9BD |
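For providers whose identifiers join two parts with a colon, such as MRVR (FeedVersion + ":" + ArticleId), the parts can be recovered with a simple split. This is a sketch only: `split_mrvr_story_id` is a hypothetical helper, and it assumes the first colon is the separator.

```python
def split_mrvr_story_id(provider_story_id):
    """Split an MRVR PROVIDER_STORY_ID into (feed_version, article_id)."""
    feed_version, separator, article_id = provider_story_id.partition(":")
    if not separator or not feed_version or not article_id:
        raise ValueError("not a FeedVersion:ArticleId value: %r" % (provider_story_id,))
    return feed_version, article_id


parts = split_mrvr_story_id("10:16481270273")
```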
PRODUCT_KEY
Identifies which content set the record came from. For RavenPack Analytics, the product key is always "RPA". 3 characters.