I'm trying to make a custom command in mod_admin_extra.erl to fetch messages between two JIDs.
My command will look like this:
ejabberdctl get_messages HOST FROM TO START_TIME END_TIME
The SQL query will be something like:
select * from archive where ((username = FROM and bare_peer = TO) or (username = TO and bare_peer = FROM)) and created_at between START_TIME and END_TIME;
I went through this thread to understand how the IQ query works, and I want to build a similar sort of thing via the command and the API.
How do I run the query in the function above so as to fetch the conversation between the two JIDs?
My response would be a list of dictionaries:
[{from: jid1, to: jid2, body: Hello, created_at: T1}]
I would in turn be using the same command for the POST API to fetch messages.
UPDATE
As per the suggestion provided by @Badlop, I updated my command spec with:
% ----------------- Custom Command Get Message ----------------------
#ejabberd_commands{name = get_message, tags = [stanza],
                   desc = "Get messages from a local or remote bare or full JID",
                   longdesc = "Get messages of a specific JID sent to a JID",
                   module = ?MODULE, function = get_message,
                   args = [{host, binary}, {from, binary}, {to, binary},
                           {start_time, binary}, {end_time, binary}],
                   args_example = [<<"localhost">>, <<"admin">>, <<"user1">>,
                                   <<"2015-07-01T00:00:00Z">>, <<"2015-07-29T13:23:54Z">>],
                   args_desc = ["Host", "From JID", "Receiver JID", "Start Time", "End Time"],
                   result = {result,
                             {tuple, [{messages, list,
                                       {message, {tuple,
                                                  [{timestamp, string},
                                                   {xml, string},
                                                   {txt, string},
                                                   {peer, integer},
                                                   {kind, integer},
                                                   {nick, string}]}}},
                                      {status, string},
                                      {count, integer}]}}},
% ----------------- Custom Command Ends -----------------------------
This is my function that gets called when the command is received.
% ----------------- Custom Function Get Message ----------------------
get_message(Host, From, To, StartTime, EndTime) ->
    mod_mam:select(
        Host,
        jid:make(From, Host),
        jid:make(From, Host),
        [{start, xmpp_util:decode_timestamp(StartTime)},
         {'end', xmpp_util:decode_timestamp(EndTime)},
         {with, jid:make(To, Host)}],
        #rsm_set{},
        chat,
        all
    ).
% ----------------- Custom Function Ends -----------------------------
However, it returns an error response:
Unhandled exception occurred executing the command:
** exception error: no function clause matching
ejabberd_ctl:format_result([],
{messages,list,
{message,
{tuple,
[{timestamp,string},
{xml,string},
{peer,integer},
{kind,integer},
{nick,string}]}}}) (src/ejabberd_ctl.erl, line 405)
in function ejabberd_ctl:format_result/2 (src/ejabberd_ctl.erl, line 461)
in call from ejabberd_ctl:try_call_command/4 (src/ejabberd_ctl.erl, line 321)
in call from ejabberd_ctl:process2/4 (src/ejabberd_ctl.erl, line 274)
in call from ejabberd_ctl:process/2 (src/ejabberd_ctl.erl, line 252)
in call from rpc:'-handle_call_call/6-fun-0-'/5 (rpc.erl, line 197)
The query printed in the logs is as follows:
2020-04-24 21:57:13.717746+05:30 [debug] SQL: "SELECT timestamp, xml, peer, kind, nick FROM archive WHERE username=E'admin' and server_host=E'localhost' and bare_peer=E'test@localhost' and timestamp >= 1587692943312536 and timestamp <= 1587779343312536 ORDER BY timestamp ASC ;"
2020-04-24 21:57:13.726745+05:30 [debug] SQL: "SELECT COUNT(*) FROM archive WHERE username=E'admin' and server_host=E'localhost' and bare_peer=E'test@localhost' and timestamp >= 1587692943312536 and timestamp <= 1587779343312536;"
This is database independent:
mod_mam:select(
    <<"localhost">>,
    jid:make(<<"user1">>, <<"localhost">>),
    jid:make(<<"user1">>, <<"localhost">>),
    [{start, xmpp_util:decode_timestamp(<<"2020-04-24T14:37:25Z">>)},
     {'end', xmpp_util:decode_timestamp(<<"2020-04-24T14:37:30Z">>)},
     {with, jid:make(<<"user2">>, <<"localhost">>)}],
    #rsm_set{},
    chat,
    all
).
Umm, you were still far away: the command result definition was wrong, and the call result must be processed. What about this?
$ ejabberdctl get_mam_messages user1@localhost user2@localhost 2020-04-27T00:00:00Z 2020-04-27T23:59:59Z
Required patch:
diff --git a/src/mod_mam.erl b/src/mod_mam.erl
index 08a4059b4..d2d74913c 100644
--- a/src/mod_mam.erl
+++ b/src/mod_mam.erl
@@ -42,6 +42,7 @@
get_room_config/4, set_room_option/3, offline_message/1, export/1,
mod_options/1, remove_mam_for_user_with_peer/3, remove_mam_for_user/2,
is_empty_for_user/2, is_empty_for_room/3, check_create_room/4,
+ get_messages_command/4,
process_iq/3, store_mam_message/7, make_id/0, wrap_as_mucsub/2, select/7]).
-include("xmpp.hrl").
@@ -1355,8 +1356,29 @@ get_jids(undefined) ->
get_jids(Js) ->
[jid:tolower(jid:remove_resource(J)) || J <- Js].
+get_messages_command(From, To, StartTime, EndTime) ->
+ FromJid = jid:decode(From),
+ {Stanzas, _, _} =
+ mod_mam:select(
+ FromJid#jid.lserver, FromJid, FromJid,
+ [{start, xmpp_util:decode_timestamp(StartTime)},
+ {'end', xmpp_util:decode_timestamp(EndTime)},
+ {with, jid:decode(To)}],
+ #rsm_set{}, chat, all),
+ [fxml:element_to_binary(xmpp:encode(Subels))
+ || {_, _, #forwarded{sub_els = [Subels]}} <- Stanzas].
+
get_commands_spec() ->
- [#ejabberd_commands{name = delete_old_mam_messages, tags = [purge],
+ [#ejabberd_commands{
+ name = get_mam_messages, tags = [mam],
+ desc = "Get archived messages of an account with another contact",
+ module = ?MODULE, function = get_messages_command,
+ args = [{from, binary}, {to, binary}, {start, binary}, {'end', binary}],
+ args_example = [<<"user1@localhost">>, <<"user2@example.org">>,
+ <<"2020-04-27T00:00:00Z">>, <<"2020-04-27T23:59:59Z">>],
+ args_desc = ["Local JID", "Contact JID", "Start Time", "End Time"],
+ result = {messages, {list, {message, string}}}},
+ #ejabberd_commands{name = delete_old_mam_messages, tags = [purge],
desc = "Delete MAM messages older than DAYS",
longdesc = "Valid message TYPEs: "
"\"chat\", \"groupchat\", \"all\".",
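A note on the last part of the question: since ejabberd exposes registered commands through mod_http_api, the new get_mam_messages command should also be callable over HTTP POST once the patch is compiled and loaded. A minimal client-side sketch, assuming mod_http_api is enabled under /api on port 5280 and the account used is allowed by the API ACL (URL, port, and credentials here are illustrative, not from the patch):

import requests  # client-side sketch only; not part of the ejabberd patch

resp = requests.post(
    "http://localhost:5280/api/get_mam_messages",  # assumes mod_http_api mounted at /api
    json={"from": "user1@localhost", "to": "user2@localhost",
          "start": "2020-04-27T00:00:00Z", "end": "2020-04-27T23:59:59Z"},
    auth=("admin@localhost", "secret"),  # illustrative admin credentials
)
print(resp.json())  # the command returns a list of message stanzas as XML strings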
About once a month I get a Google Drive folder with lots of videos in it (usually around 700-800) and a spreadsheet whose column A gets populated with the names of all of the video files, in order of the timestamp in the video file name. I already have the code that does this (posted below). This time, however, I have about 8,400 video files in the folder, and the algorithm has a pageSize limit of 1,000 (it was originally 100; I changed it to 1,000, but that's the highest value it will accept). How do I change this code to accept more than 1,000?
This is the part that initializes everything
!pip install gspread_formatting
import time
import gspread
from gspread import urls
from google.colab import auth
from datetime import datetime
from datetime import timedelta
from gspread_formatting import *
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials
from google.auth import default
folder_id = '************************' # change to whatever folder the required videos are in
base_dir = '/Example/drive/videofolder' # change this to whatever folder path you want to grab videos from same as above
file_name_qry_filter = "name contains 'mp4' and name contains 'cam'"
file_pattern="cam*.mp4"
spreadSheetUrl = 'https://docs.google.com/spreadsheets/d/SpreadsheetIDExample/edit#gid=0'
data_drive_id = '***********' # This is the ID of the shared Drive
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)
#gc = gspread.authorize(GoogleCredentials.get_application_default())
wb = gc.open_by_url(spreadSheetUrl)
sheet = wb.worksheet('Sheet1')
And this is the main part of the code
prevTimeStamp = None
prevHour = None
def dateChecker(fileName, prevHour):
    strippedFileName = fileName.removesuffix(".mp4")  # drop the .mp4 from the end of the file name (strip(".mp4") removes characters, not the suffix)
    parsedFileName = strippedFileName.split("_")  # split the file name into an array of (0 = Cam#, 1 = yyyy-mm-dd, 2 = hh-mm-ss)
    timeStamp = parsedFileName[2]  # grab specifically the hh-mm-ss time section from the original file name
    parsedTimeStamp = timeStamp.split("-")  # split the time stamp into an array of (0 = hour, 1 = minute, 2 = second)
    hour = int(parsedTimeStamp[0])
    minute = int(parsedTimeStamp[1])
    second = int(parsedTimeStamp[2])  # set hour, minute, and second each to its own variable
    commentCell = "Reset"
    if prevHour is None:
        commentCell = " "
        prevHour = hour
    else:
        if 0 <= hour < 24:
            if hour == 0:
                if prevHour == 23:
                    commentCell = " "
                else:
                    commentCell = "Missing Video1"
            else:
                if hour - prevHour == 1:
                    commentCell = " "
                else:
                    commentCell = "Missing Video2"
        else:
            commentCell = "Error hour is not between 0 and 23"
        if minute != 0 or 1 < second < 60:
            commentCell = "Check Length"
        prevHour = hour
    return commentCell, prevHour
# Drive query variables
parent_folder_qry_filter = "'" + folder_id + "' in parents" #you shouldn't ever need to change this
query = file_name_qry_filter + " and " + parent_folder_qry_filter
drive_service = build('drive', 'v3')
# Build request and call Drive API
page_token = None
response = drive_service.files().list(q=query,
                                      corpora='drive',
                                      supportsAllDrives='true',
                                      includeItemsFromAllDrives='true',
                                      driveId=data_drive_id,
                                      pageSize=1000,
                                      fields='nextPageToken, files(id, name, webViewLink)',  # you can add extra fields in files() if you need more information about the files you're grabbing
                                      pageToken=page_token).execute()
i = 1
array = [[], []]
# Parse/print results
for file in response.get('files', []):
    array.insert(i-1, [file.get('name'), file.get('webViewLink')])  # if you add extra fields above, this is where you have to start changing the code to accommodate them
    i = i + 1
array.sort()
array_sorted = [x for x in array if x]  # filters out the empty placeholder lists that array was initialized with
arrayLength = len(array_sorted)
print(arrayLength)
commentCell = 'Error'
# for file_name in array_sorted:
#     date_gap, start_date, end_date = date_checker(file_name[0])
#     if prev_end_date == None:
#         print('hello')
#     elif start_date != prev_end_date:
#         date_gap = 'Missing Video'
for file_name in array_sorted:
    commentCell, prevHour = dateChecker(file_name[0], prevHour)
    time.sleep(0.3)
    # insertRow = [file_name[0], "Not Processed", " ", date_gap, " ", " ", " ", " ", base_dir + '/' + file_name[0], " ", file_name[1], " ", " ", " "]
    insertRow = [file_name[0], "Not Processed", " ", commentCell, " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " "]
    sheet.append_row(insertRow, value_input_option='USER_ENTERED')
Now I know the problem has to do with the
page_token = None
response = drive_service.files().list(q=query,
                                      corpora='drive',
                                      supportsAllDrives='true',
                                      includeItemsFromAllDrives='true',
                                      driveId=data_drive_id,
                                      pageSize=1000,
                                      fields='nextPageToken, files(id, name, webViewLink)',
                                      pageToken=page_token).execute()
in the middle of the main part of the code. I've already tried just changing the pageSize limit to 10,000, but I knew that wouldn't work, and I was right; it came back with:
HttpError: <HttpError 400 when requesting https://www.googleapis.com/drive/v3/files?q=name+contains+%27mp4%27+and+name+contains+%27cam%27+and+%271ANmLGlNr-Cu0BvH2aRrAh_GXEDk1nWvf%27+in+parents&corpora=drive&supportsAllDrives=true&includeItemsFromAllDrives=true&driveId=0AF92uuRq-00KUk9PVA&pageSize=10000&fields=nextPageToken%2C+files%28id%2C+name%2C+webViewLink%29&alt=json returned "Invalid value '10000'. Values must be within the range: [1, 1000]". Details: "Invalid value '10000'. Values must be within the range: [1, 1000]">
The one idea I have is to have multiple pages with 1,000 each and then iterate through them, but I barely understood how this part of the code worked a year ago when I set it up, and since then I haven't touched Google Colab except to run this algorithm. Every time I try to google how to do this, or look up the Google Drive API, everything comes back with how to download and upload a couple of files, whereas what I need is just a list of the names of all the files.
The documentation explains how to use the pageToken for pagination (the page is for Calendar API but it works the same in Drive):
In order to retrieve the next page, perform the exact same request as previously and append a pageToken field with the value of nextPageToken from the previous page. A new nextPageToken is provided on the following pages until all the results are retrieved.
Essentially you want a loop where you run files.list(), retrieve the pageToken, and run it again while feeding it the previous token until you stop getting tokens.
For your specific scenario you can try to replace the "problem" snippet with the following:
page_token = ""
filelist = {}
while True:
    response = drive_service.files().list(q=query,
                                          corpora='drive',
                                          supportsAllDrives='true',
                                          includeItemsFromAllDrives='true',
                                          driveId=data_drive_id,
                                          pageSize=1000,
                                          fields='nextPageToken, files(id, name, webViewLink)',
                                          pageToken=page_token).execute()
    page_token = response.get('nextPageToken', None)
    filelist.setdefault("files", []).extend(response.get('files'))
    if not page_token:
        break
response = filelist
response = filelist
This does as I described, looping files.list() and adding the results to the filelist variable, then breaking the loop when the API stops returning page tokens. At the end I just assigned the value of filelist to the response variable since that's what you're using in the rest of your code. It should parse the same way but with the full list of results this time.
Sources:
Page through list of resources
Files.list()
I wanted to extract noun phrases from tweets; the code follows. The problem is that it only processes 300 tweets at a time and takes 5 minutes. How can I speed it up?
By the way, some of the code was adapted from TextBlob.
I use the gate-EN-twitter model (https://gate.ac.uk/wiki/twitter-postagger.html) and the NLTK interface to the Stanford POS tagger to tag tweets.
from nltk.tag import StanfordPOSTagger
from nltk.tokenize import word_tokenize
import time,nltk
start_time = time.time()
CFG = {
    ('NNP', 'NNP'): 'NNP',
    ('NN', 'NN'): 'NNI',
    ('NNI', 'NN'): 'NNI',
    ('JJ', 'JJ'): 'JJ',
    ('JJ', 'NN'): 'NNI',
}
st = StanfordPOSTagger('/models/gate-EN-twitter.model','/twitie_tagger/twitie_tag.jar', encoding='utf-8')
def _normalize_tags(chunk):
    '''Normalize the corpus tags.
    ("NN", "NN-PL", "NNS") -> "NN"
    '''
    ret = []
    for word, tag in chunk:
        if tag == 'NP-TL' or tag == 'NP':
            ret.append((word, 'NNP'))
            continue
        if tag.endswith('-TL'):
            ret.append((word, tag[:-3]))
            continue
        if tag.endswith('S'):
            ret.append((word, tag[:-1]))
            continue
        ret.append((word, tag))
    return ret
def noun_phrase_count(text):
    matches1 = []
    print('len(text)', len(text))
    for i in range(len(text)//1000):  # note: a final partial chunk under 1000 characters is skipped here
        tokenized_text = word_tokenize(text[i*1000:i*1000+1000])  # fixed slice: was i*10000+1000, which made the chunks overlap wildly
        classified_text = st.tag(tokenized_text)
        tags = _normalize_tags(classified_text)
        merge = True
        while merge:
            merge = False
            for x in range(0, len(tags) - 1):
                t1 = tags[x]
                t2 = tags[x + 1]
                key = t1[1], t2[1]
                value = CFG.get(key, '')
                if value:
                    merge = True
                    tags.pop(x)
                    tags.pop(x)
                    match = '%s %s' % (t1[0], t2[0])
                    pos = value
                    tags.insert(x, (match, pos))
                    break
        matches = [t[0] for t in tags if t[1] in ['NNP', 'NNI']]
        matches1 += matches
        print("--- %s seconds ---" % (time.time() - start_time))
    fdist = nltk.FreqDist(matches1)
    return [(tag, num) for (tag, num) in fdist.most_common()]
noun_phrase_count(tweets)
This looks like a duplicate of Stanford POS tagger with GATE twitter model is slow, so you may find more info there.
Additionally, if there's any chance of stumbling upon identical inputs (tweets) twice or more, you can consider a dictionary with the tweet (plain str) as key and the tagged result as value, so that when you encounter a tweet you first check whether it's already in your dict. If not, tag it and put it there (and if this route is viable, why not pickle/unpickle that dictionary so that debugging and subsequent runs of your code go faster as well). A sketch follows.
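A minimal sketch of that caching idea, assuming st is the StanfordPOSTagger instance from the question (the cache file name is an illustrative choice):

import os
import pickle
from nltk.tokenize import word_tokenize

CACHE_PATH = 'tag_cache.pickle'  # hypothetical file name for the persisted cache

# Load a previously saved cache if one exists, so later runs skip old work.
if os.path.exists(CACHE_PATH):
    with open(CACHE_PATH, 'rb') as f:
        tag_cache = pickle.load(f)
else:
    tag_cache = {}

def tag_cached(tweet):
    # Tag each distinct tweet only once; duplicates become dictionary lookups.
    if tweet not in tag_cache:
        tag_cache[tweet] = st.tag(word_tokenize(tweet))
    return tag_cache[tweet]

# After processing, persist the cache for the next run.
with open(CACHE_PATH, 'wb') as f:
    pickle.dump(tag_cache, f)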
I want to build a customer support chat app. There are users and an admin, and below the admin there are multiple sub-admins. Initially the chat is initiated with the admin only, but if the admin is offline I need to route the message to the sub-admins.
The offline_message_hook hook serves the purpose. I'll check whether the To is the admin, and if so I need to route the Packet to one of the sub-admins. How do I route/send the packet to another user within offline_message_hook? In short, how do I change the To of the packet so that it is redirected to the new sub-admin?
Here is what I've tried:
offline_message_hook({_Action, #message{from = Peer, to = To} = Pkt} = Acc) ->
    ?INFO_MSG("Inside offline", []),
    ejabberd_router:route(Peer, To, Pkt),
    ok.
I'm using ejabberd 17.04.105.
Update
After following user2610053's advice, I did this:
-spec offline_message_hook({any(), message()}) -> {any(), message()}.
offline_message_hook({_Action, Msg} = Acc) ->
    ejabberd_router:route(xmpp:set_to(Msg, 'praful@localhost')),
    {routed, Msg}.
The following is the error:
15:13:12.291 [error] failed to route packet:
#message{id = <<"purple187f6502">>,type = chat,lang = <<"en">>,
from = {jid,<<"praful2">>,<<"localhost">>,<<"Prafuls-MacBook-Pro">>,
<<"praful2">>,<<"localhost">>,<<"Prafuls-MacBook-Pro">>},
to = praful@localhost,subject = [],
body = [#text{lang = <<>>,data = <<"co=umon">>}],
thread = undefined,
sub_els = [{xmlel,<<"active">>,
[{<<"xmlns">>,
<<"http://jabber.org/protocol/chatstates">>}],
[]}],
meta = #{ip => {0,0,0,0,0,0,0,1}}}
Reason = {error,{{badrecord,jid},[{ejabberd_router,do_route,1,[{file,"src/ejabberd_router.erl"},{line,343}]},{ejabberd_router,route,1,[{file,"src/ejabberd_router.erl"},{line,87}]},{mod_sunshine,offline_message_hook,1,[{file,"src/mod_sunshine.erl"},{line,24}]},{ejabberd_hooks,safe_apply,4,[{file,"src/ejabberd_hooks.erl"},{line,380}]},{ejabberd_hooks,run_fold1,4,[{file,"src/ejabberd_hooks.erl"},{line,364}]},{ejabberd_sm,route,1,[{file,"src/ejabberd_sm.erl"},{line,138}]},{ejabberd_local,route,1,[{file,"src/ejabberd_local.erl"},{line,116}]},{ejabberd_router,do_route,1,[{file,"src/ejabberd_router.erl"},{line,348}]}]}}
The user praful@localhost exists. Please advise: what exactly is wrong?
Update 2 - user_receive_packet hook
In the user_receive_packet hook, upon using the same call ejabberd_router:route(xmpp:set_to(Packet, jid:decode("praful@localhost"))), it throws an error saying:
Hook user_receive_packet crashed when running mod_sunshine:user_receive_packet/1:
** Reason = {error,function_clause,[{jid,decode,[{file,"src/jid.erl"},{line,132}],["praful#localhost"]},{mod_sunshine,user_receive_packet,[{file,"src/mod_sunshine.erl"},{line,29}],1},{ejabberd_hooks,safe_apply,[{file,"src/ejabberd_hooks.erl"},{line,380}],4},{ejabberd_hooks,run_fold1,[{file,"src/ejabberd_hooks.erl"},{line,364}],4},{ejabberd_c2s,process_info,[{file,"src/ejabberd_c2s.erl"},{line,231}],2},{ejabberd_hooks,safe_apply,[{file,"src/ejabberd_hooks.erl"},{line,380}],4},{ejabberd_hooks,run_fold1,[{file,"src/ejabberd_hooks.erl"},{line,364}],4},{xmpp_stream_in,handle_info,[{file,"src/xmpp_stream_in.erl"},{line,373}],2}]}
So I read about function_clause, but I couldn't understand it. What exactly is wrong here?
I think you're asking about xmpp:set_to/2. Note that it expects a #jid{} record: passing the atom 'praful@localhost' is what triggers the badrecord,jid error, and jid:decode/1 expects a binary, not a string, which is why jid:decode("praful@localhost") fails with function_clause - use jid:decode(<<"praful@localhost">>). Here is an example:
offline_message_hook({_Action, Msg} = Acc) ->
    SubAdmins = get_sub_admins(Msg#message.to),
    lists:foreach(
      fun(Admin) ->
              ejabberd_router:route(xmpp:set_to(Msg, Admin))
      end, SubAdmins),
    {routed, Msg}.
I'm selecting records from SQL Server, using a drop-down list and a textbox, into another form in ASP.NET (VB.NET), but it gives me the error:
incorrect syntax near like
The script is:
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    If Len(Session("LibuserID")) = 0 Then
        Response.Redirect("./index.aspx")
    End If
    Dim DBConn As SqlConnection
    Dim DBCommand As SqlDataAdapter
    Dim DSPageData As New DataSet
    DBConn = New SqlConnection("Data Source=localhost;" & _
        "initial catalog=test;Integrated Security=True;")
    If Request.QueryString("Type") = "Search" Then
        lblMessage.Text = "Resultati Poiska:" ' transliterated Russian: "Search results:"
        DBCommand = New SqlDataAdapter _
            ("Select LibBookID,BookTitle,Author,Status " _
            & "from LibBooks where " _
            & Request.QueryString("ddlSearchField") & "Like '%" _
            & Replace(Request.QueryString("txtSearchText"), "'", "''") _
            & "&' order by BookTitle", DBConn)
    ElseIf Request.QueryString("Type") = "Browse" Then
        lblMessage.Text = "kniqi otnosyasiesya k etoy kategorii:" ' transliterated Russian: "Books belonging to this category:"
        DBCommand = New SqlDataAdapter _
            ("select LibBookID,BookTitle,Author,Status " _
            & "from LibBooks where " _
            & "LibBookCategoryID = " _
            & Request.QueryString("LibBookCategoryID") _
            & "Order By BookTitle", DBConn)
    Else
        Response.Redirect("./menu.aspx")
    End If
    DBCommand.Fill(DSPageData, _
        "Books")
    dbBooks.DataSource = _
        DSPageData.Tables("Books").DefaultView
    dbBooks.DataBind()
End Sub
The error is:
Incorrect syntax near the keyword 'Like'.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Data.SqlClient.SqlException: Incorrect syntax near the keyword 'Like'.
Source Error:
Line 33: Response.Redirect("./menu.aspx")
Line 34: End If
Line 35: DBCommand.Fill(DSPageData, _
Line 36: "Books")
Line 37: dbBooks.DataSource = _
Put a space before your LIKE clause.
& Request.QueryString("ddlSearchField") & " Like '%" _
As someone else said, you should use a parameterized query instead of this.
Probably your other problem is that Request.QueryString("ddlSearchField") is null or empty; if you want to change your query into a parameterized one you have to rewrite it, and if you just want the current code to work you have to check whether the values are null or empty. A sketch of the parameterized version follows.
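A minimal sketch of the Search branch rewritten with parameters, assuming the same LibBooks schema as in the question (the allowedFields whitelist is an illustrative assumption; a column name cannot travel as a SQL parameter, so it has to be validated separately):

' Whitelist the column name - it cannot be passed as a SQL parameter.
Dim allowedFields As String() = {"BookTitle", "Author"}
Dim searchField As String = Request.QueryString("ddlSearchField")
If Array.IndexOf(allowedFields, searchField) < 0 Then
    Response.Redirect("./menu.aspx")
End If

DBCommand = New SqlDataAdapter( _
    "Select LibBookID, BookTitle, Author, Status " & _
    "from LibBooks where " & searchField & " Like @SearchText " & _
    "order by BookTitle", DBConn)
' The user-supplied text travels as a parameter, so quoting and escaping are handled for you.
DBCommand.SelectCommand.Parameters.AddWithValue( _
    "@SearchText", "%" & Request.QueryString("txtSearchText") & "%")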
I'm trying to get the "To: user@mail.com" field from a MIME mail message.
I have this code:
parse_to(Data) ->
    List = string:tokens(Data, ":"),
    Sep1 = lists:map(fun(H) -> string:tokens(H, ":") end, List),
    io:format(Sep1),
    Sep2 = lists:filter(fun([K | _]) -> K == "To" end, Sep1),
    ListAddress = lists:append(Sep2),
    [_ | Tail] = ListAddress,
    lists:map(fun(Address) -> string:tokens(Address, ",") end, Tail).
If I have a short message, for example https://gist.github.com/865910, then io:format(Sep1) gives https://gist.github.com/865905 - that's OK, everything is split and the ':' characters are gone.
But if I have a long message with an attachment - https://gist.github.com/865914 - then io:format(Sep1) gives https://gist.github.com/865906, where everything remains exactly as it was, ':' included.
What's wrong? Why does the short message parse normally while the big message doesn't?
When I try a regexp:
List = binary_to_list(Binary),
re:run(List, "^To: (.)*$", [multiline, {capture, all_but_first, list}]).
I get only {match, ["m"]}.
Why?
Thank you.
Try a regular expression. Note that in your attempt the quantifier was outside the capture group - "^To: (.)*$" - so the group only keeps the last single character it matched (hence {match, ["m"]}); put the quantifier inside the group instead:
1> Data = <<"...">>. % Your long message
<<"...">>
2> re:run(Data, <<"^To: (.*)$">>, [multiline, {capture, all_but_first, binary}]).
{match,[<<"shk@shk.dyndns-mail.com">>]}