Twitter Activity provides loads of information for analysis. You can learn your target audience better, determine the most popular content, check the right time when your visitors are more active, etc. By getting the Twitter Activity data and putting it into the advanced analytical systems, you can correlate that data with other essential information from different sources. The prospects are tremendous.
In this article, we’ll explore the case of importing the Twitter data with the help of SQL Developer Web into an Autonomous Database. The imported Twitter dataset is in JSON format. Thus, you’ll need to work on how to import JSON data into Autonomous Database.
Our Goals:
- To provision the Oracle Autonomous Data Warehouse (ADW) instance in Oracle Cloud Infrastructure Console (OCI)
- To understand the JSON fundamentals
You will also need a Twitter account with the data. However, this demand is optional. Instead of an active Twitter account, you can use the sample data we’ll provide in this article.
Our Workflow:
In this article, we’ll illustrate the following steps to accomplish our task of importing and analyzing the Twitter data:
- Download Twitter data
- Modify JSON Tweet data
- Import data with SQL Developer Web
- Build Twitter Activity Analysis
Download Twitter Data
If you have a Twitter account, you can extract the data using Twitter API or download archived data directly. Our tutorial will focus on the second option – direct download. Refer to the official instruction on downloading your Twitter archive.
You will have to enter your password. The data preparation process can take 24 hours or longer. When it is ready to download, you will be notified by email.
If you don’t have a Twitter account or won’t use it for some reason, you can download the dataset sample.
This dataset is from Kaggle. It is converted from CSV to JSON. The purpose of the dataset is to help you import the JSON data into the Autonomous Database.
There are the following 6 fields:
- target: the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
- ids: the id of the tweet (2087)
- date: the date of the tweet (Sat May 16 23:58:44 UTC 2009)
- flag: The query. If there is no query, then this value is NO_QUERY.
- user: the user that tweeted (abcxyz)
- text: the text of the tweet (Twitter is cool)
Import Twitter Data into ADW
By default, the Twitter data is in JSON format. You need to import JSON data using SQL Developer Web or SQL Developer desktop.
Enabling User Access to SQL Developer Web
SQL Developer Web allows you to import JSON files into Autonomous Database directly. However, you can’t access it if ADMIN does not enable your user schema to access SQL Developer Web. In Autonomous, the ADMIN user can access the SQL Developer Web by default only. Other database users are always disabled.
Refer to the previous topic Provisioning and Oracle Autonomous Database for more information.
Thus, you need to use SQL Developer to access Autonomous Database with ADMIN user. If you are not ADMIN, you need to contact the OCI Admin or who is ADMIN of Autonomous to request user access.
Run the following code to enable User Access to SQL Developer Web:
BEGIN
ords_admin.enable_schema(
p_enabled => TRUE,
p_schema => 'schema-name',
p_url_mapping_type => 'BASE_PATH',
p_url_mapping_pattern => 'schema-alias',
p_auto_rest_auth => NULL
);
commit;
END;
In that code,
- schema-name is the database schema name in all-uppercase.
- schema-alias is an alias for the schema name that will appear in the URL that the user will apply to access SQL Developer Web.
Oracle recommends that you do not use the schema name itself for security reasons.
In our case, the database user is DEMO – enter the following code for schema-name and schema-alias:
After enabling user access, the ADMIN user needs to provide the enabled user with the URL to access SQL Developer Web.
In OCI, navigate to Oracle Database > Autonomous Database
Click on the ADWDEMO instance > go to the Detail page of ADW instance > click on Service Console
You are redirected to Service Console where you can access services integrated with Autonomous Database.
Navigate to Development > Database Actions. You continue redirecting to the SQL Developer Web page:
By default, the URL of SQL Developer Web is https://xxxxxxxxxx-adwdemo.adb.us-ashburn-1.oraclecloudapps.com/ords/sql-developer
Enter the enabled user (database user schema) and password to access. Here, you enable the DEMO schema:
Now you are ready to import JSON Twitter data into Autonomous Data Warehouse.
Importing the Twitter dataset
Click on Data Load to start importing data > Load Data and Local File
Browse to the JSON dataset to import:
Click on the Pencil icon to edit properties of the JSON file and provide the table that will be created as TWITTER_TWEET_USER. Keep the column name as default:
Close the properties pop-up to go back to the Data Load page, and click Run to start loading data.
When the process is complete, you’ll see the confirmation message.
Now, you can query the table to make sure that JSON data was parsed and inserted correctly. In the top-left corner, click on the hamburger icon to navigate to SQL:
Query data from the TWITTER_TWEET_USER table:
Build Twitter Sentiment Analysis
After loading the Twitter JSON dataset, you can build the Sentiment analysis with the Data Flow in Oracle Analytics.
Make sure you have created the connection to the ADW instance. Then you will only create a new data set for Twitter in Oracle Analytics:
Create a new Data Flow and add the Twitter dataset. Save the data flow as DF_Twitter_Sentiment_Analysis
Click the Plus icon for the Twitter dataset > Analyze Sentiment
Choose the TEXT column from the dataset for Text to Analyze
Continue Add Columns in the data flow to add a measure Count
The output of the data flow got additional Emotion and Count Columns. Now, you need to save the final dataset at the Save Data step.
Enter the Data Set name – Twitter Sentiment Data. Change all columns as Attribute except for the Count Column – it should be Measure.
Save the data flow and run it to generate the final dataset. Now, you can build the Twitter Sentiment Analysis with the new dataset generated.
Import the Twitter dataset
As you are already familiar with the SQL Developer Web interface, let’s explore your Twitter data. Whenever you receive the email notification about the Twitter Archived Data, you can download and import it.
Download and unzip the archived file. You will see two folders there:
- assets
- data
Most of the data files are in the data folder.
You can import the JSON files (all your tweets in Twitter, in this case). Open the data folder > open the tweet.js file with any Text Editor (e.g., Notepad).
In the context of this file, you will see a part of JSON data that including the following line:
window.YTD.tweet.part0 =
Before importing the JSON data into Autonomous, you need to remove this fragment in the context of the tweet.js file:
Save the tweet.js file and access SQL Developer Web within еру DEMO user as you did in the previous step.
In the top-left corner, click the Hamburger icon to navigate to SQL. Then, browse the tweet.js file to upload it.
By default, SQL Developer Web analyzes only the first level or attributes in the document itself. The tweet.js file has the following data structure:
The first level that is parsed by SQL Develop Web allows you to see only the tweet column extracted in the process. However, you also need the details, such as retweets, source, full_text, and more. In case of a complicated JSON structure, you can use Oracle’s SQL JSON parsing functionality that provides you with all functionality to parse the JSON file or slightly change the structure of the file to ingest it correctly.
Replace the }, { "tweet" : string by the comma (,) character:
Remove the "tweet" : { string in the first document and the } symbol in the last document of the file. The file might look like that:
Now, browse to the edited JSON file (tweet.js) to upload it:
Click Next to go to the Table Definition step.
Enter the table name TWITTER_MY_TWEET and choose the schema in the ADW instance.
You also see the table definition with columns and data types. Hence, you can keep the default definitions or change the data type for each column.
Click Next to go to the Review step and start loading the JSON data:
Click Finish and verify the data.
Conclusion
JSON is one of the most popular data structures, the standard data structure for most API services or Web services. That’s why Oracle has cared about providing fully functional tools to parse such JSON data and turn it into the formatted structure (tabular mode):
- The Data Load interface to import JSON files directly. However, there are restrictions in case of complicated JSON structure (multiple levels, subdocuments, etc.)
- SQL/JSON Function provides you with all the necessary functionality to parse the data
- The Autonomous JSON database is one of the autonomous database types supported by Oracle.
Hope that this article was helpful for you. Stay tuned, as we’ll continue the series of articles on this topic.
Tags: json, oracle autonomous database, sql developer web Last modified: September 08, 2022