Creating tokens by row with tweets #quanteda - token

I want to create a column of tokens by tweet with the quanteda package in R. Right now, when I tokenize, it gives me 25,000 rows for all my tweets. However, I want to just have a column filled with the tokens in the same row as the tweet.
I tried the tokens() function in quanteda and also tried grouping by tweet.

Related

Seeking to create columns of tweets with just the tokens

This is my dataframe. I am hoping to create a column called tokens that has the tokens for each tweet in the rows.
I tried making the tokens a tibble.

Sorting QUERY output alongside user entered data in Google Sheets

I'm having trouble thinking about how to sort the results of a google sheets function alongside user entered data in adjacent cells.
Example sheet: https://docs.google.com/spreadsheets/d/1YfV93o8WEEgSG19WhbKOH1JBX5uIIz9YKVQLWSoA_Ik/edit?usp=sharing
For example, I've performed a case insensitive search function on a full list of raw data in columns Q to T using
=QUERY(Q:T, "WHERE (LOWER(Q) contains '"&LOWER(C12)&"')",2)
This fills columns B,C,D,E as desired. In columns F,G I would like users to be able to add their name and the quantity of items they would like sent to them. I can sort the query results (let's say alphabetically by B) by just encapsulating it in SORT() or adding "order by" in the query, but the user entered data in columns F,G is not sorted along with this. Please could I have some ideas on how to achieve this? I'm at a loss after searching for existing questions.

Number increment in Google Sheets formula

In a Google Sheets database, I have a formula which I have built in order to allocate a reference number to a series of companies.
Each company should have its unique number in the form of RET00XX where XX will represent the unique company number. I would like these numbers to be sequential, starting on 1 and going on +1 after that.
Whenever a new company is inserted in the database, the formula should be able to attribute it a reference number. It should also be able to verify if the company already exists in the database and, if so, automatically attribute it the company's unique reference number, instead of creating a new one.
The company names are in cells of column B.
This is the formula I have built (an example of the one in row 2):
=ARRAYFORMULA(IF($B2<>"",IF((COUNTIF($B$1:$B1,$B2)>0),INDEX($A$1:$R2,MATCH($B2,$B$1:$B1,0),12),CONCATENATE("RET00",ROW($B2))),""))
The steps it takes are:
It verifies that column B in the corresponding row is not empty;
With the COUNTIF function, it checks whether the company already exists in any of the previous rows;
If the company does exist, it assigns the corresponding reference number through the INDEX function;
If the company doesn't exist, it assigns the company a new reference number with the CONCATENATE and ROW functions.
The formula is largely working, although there are some problems.
Users adding to this database tend to add entries by inserting rows in the middle of the database. Because of the way the formula is built, the companies' unique reference codes change each time that happens. I believe this is partly because I use the ROW function. Also, given that new rows are inserted in the middle of the database, the formula should verify whether the company already exists by checking all rows, not just the previous ones (when a new row is inserted, the formula only checks the rows above it, even though the company could be in the rows below).
How can I attribute sequential numbers in a formula without reference to ROW? Also, how can I make sure that the spreadsheet verifies for all rows of column B instead of just the ones before the inserted row?
Apply this formula in your sheet:
=ArrayFormula(if(B2:B<>"",row(A2:A)-1,""))
For more information, see: https://infoinspired.com/google-docs/spreadsheet/auto-serial-numbering-in-google-sheets/
Solution that is independent of starting row number
These examples will allow you to generate incrementing values in your formulas.
Incrementing integers, zero based:
The values will be: 0,1,2,3, etc.
Note: The address "$A$2" represents the cell of your top row. Change it to whatever cell your actual top row is. The nice thing about this method is that it will not break if you insert new rows above the start position of your formula.
=(ROW()-ROW($A$2))
Integers, one based:
The values will be: 1,2,3,4, etc.
=(ROW()-ROW($A$2) + 1)
Dates:
The values will be: 2000-01-01,2000-01-02,2000-01-03, etc.
=Date(2000,1,1) + (ROW()-ROW($A$2))
All Even Numbers:
The values will be: 0,2,4, etc.
=(ROW()-ROW($A$2)) * 2
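The arithmetic behind these formulas can be sketched outside the spreadsheet. The JavaScript below is only an illustration: `offsetFromTop`, `topRow`, and the sample row numbers are hypothetical names and values, with `topRow = 2` assumed to match the `$A$2` anchor used above.

```javascript
// Hypothetical helper mirroring =(ROW()-ROW($A$2)): distance of a row from the top row.
function offsetFromTop(row, topRow) {
  return row - topRow;
}

const topRow = 2;          // assumption: the series starts in row 2, like $A$2
const rows = [2, 3, 4, 5]; // sample row numbers, for illustration only

// Zero-based integers: 0,1,2,3
const zeroBased = rows.map((r) => offsetFromTop(r, topRow));

// One-based integers: 1,2,3,4
const oneBased = rows.map((r) => offsetFromTop(r, topRow) + 1);

// Even numbers: 0,2,4,6
const evens = rows.map((r) => offsetFromTop(r, topRow) * 2);

// Dates starting at 2000-01-01, one day per row
const dates = rows.map((r) => {
  const d = new Date(Date.UTC(2000, 0, 1));
  d.setUTCDate(d.getUTCDate() + offsetFromTop(r, topRow));
  return d.toISOString().slice(0, 10);
});

// Inserting 3 rows above the series shifts both the row and the anchor,
// so the offset stays the same.
const shifted = rows.map((r) => offsetFromTop(r + 3, topRow + 3));

console.log(zeroBased, oneBased, evens, dates, shifted);
```

Because the row and the anchor move together, the sequence depends only on the distance from the top row, which is why inserting rows above the start position does not change the values.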
Short answer
Use Google Apps Script
Explanation
Using spreadsheet functions to set an ID on a live spreadsheet used as a database is very risky, as the values will be recalculated whenever changes are made to the spreadsheet content.
Instead of using a formula use a script to add a "fixed value". Scripts could be called automatically on events like cell edits and row insertion, by using a custom menu or side panel, from the script editor or by time-driven triggers.
The following Q&A from Web Applications shows several ways to set a sequential number:
Can I add an autoincrement field to a Google Spreadsheet based on a Google Form?
This other question from Stack Overflow could be helpful too:
Auto incrementing Job Reference
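As a rough illustration of the "fixed value" idea, here is a minimal JavaScript sketch of the ID-assignment logic a script could run. It is not taken from the linked answers. The function name `assignReferences`, the sample data, and the `RET` plus four-digit padding format (so company 1 becomes RET0001) are assumptions made for this example.

```javascript
// Hypothetical pure function: given company names (column B) and the references
// already stored for each row ('' where none is set), return the references to write back.
// Existing references are never changed; a company already present on ANY row reuses its reference.
function assignReferences(names, refs) {
  const byCompany = new Map();
  let max = 0;

  // First pass: collect references already stored, scanning all rows (above and below).
  names.forEach((name, i) => {
    const ref = refs[i];
    if (name && ref) {
      if (!byCompany.has(name)) byCompany.set(name, ref);
      const n = parseInt(ref.replace(/^RET/, ''), 10);
      if (!Number.isNaN(n) && n > max) max = n;
    }
  });

  // Second pass: fill in rows that have a company name but no reference yet.
  return names.map((name, i) => {
    if (!name) return '';
    if (refs[i]) return refs[i];
    if (!byCompany.has(name)) {
      max += 1; // next sequential number, independent of row position
      byCompany.set(name, 'RET' + String(max).padStart(4, '0')); // assumed format
    }
    return byCompany.get(name);
  });
}

// Initial fill: only the first row has a stored reference.
const first = assignReferences(
  ['Acme', 'Beta', 'Acme', '', 'Gamma'],
  ['RET0001', '', '', '', '']
);

// Later, a row for 'Gamma' is inserted in the middle; stored references are kept.
const afterInsert = assignReferences(
  ['Acme', 'Gamma', 'Beta', 'Acme', '', 'Gamma'],
  ['RET0001', '', 'RET0002', 'RET0001', '', 'RET0003']
);

console.log(first, afterInsert);
```

In Apps Script, a function like this could presumably be called from an edit trigger, reading the two columns with `getRange(...).getValues()` and writing the result back with `setValues(...)`, so that the references are stored as plain values rather than recalculated by a formula.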
Insert 1 in the first cell and paste the formula below in the following cells.
=INDIRECT(ADDRESS(ROW()-1,COLUMN())) + 1
Add a number in the very first row and enter the formula in the cells below it.
I used =A1+1 to get an incremental number for indexing the tasks on each line.

Sum form values in responses sheet

I am trying to sum incoming numbers that are user inputted via Google Forms and then transferred to a Google responses spreadsheet.
I have tried the basic functions in my attempts to solve this, but when new information drops into the responses spreadsheet, the formulas all move down a row since the incoming information is inputted at the highest row that hasn't received the Forms output.
If there is a way to sum the incoming data on a Google spreadsheet that would be great.
If the values you wish to sum happen to be in columns C:E of the responses sheet, then:
=ArrayFormula(C2:C101+D2:D101+E2:E101)
entered in row 2 of a spare column, after a response has been received, may suit.
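To make the row-by-row behaviour of that array formula concrete, here is a small JavaScript sketch of the same calculation. The function name `rowSums` and the sample values are hypothetical. Blank cells are treated as 0, and non-numeric text is also treated as 0 here, which is a simplification compared with how the spreadsheet would handle it.

```javascript
// Hypothetical helper: one total per response row, like C+D+E evaluated for each row.
function rowSums(rows) {
  return rows.map((cells) =>
    cells.reduce((total, cell) => total + (Number(cell) || 0), 0)
  );
}

// Sample responses: each inner array holds the C, D and E values of one row.
const totals = rowSums([
  [1, 2, 3],
  [4, '', 6], // a blank answer counts as 0
]);

console.log(totals);
```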

Can I get a column formula for a fusion table from the API

I'm trying to backup one of my fusion tables and would like to be able to use the API to do so. The trouble is I am not able to get the formula for a particular column, just the values.
I'm able to describe the table and see the column info (id, name, and type) and then do a SELECT and get all the information in the table, but what comes back is the value, not the formula. Is there any way I can get the formula?
It is not possible to retrieve the formula in a column from the API. Please feel free to open a feature request on the list:
http://code.google.com/p/fusion-tables/issues/list
