Indy10 > How to wait after WriteLn() call - c++builder

My Environment:
Windows 7 Pro (32bit)
C++ Builder XE4
I would like to know how to wait after calling WriteLn(). The following is my sample code:
void __fastcall TForm1::IdTCPServer1Execute(TIdContext *AContext)
{
    UTF8String rcvdStr;
    rcvdStr = AContext->Connection->IOHandler->ReadLn(
        IndyTextEncoding(TEncoding::UTF8) );

    TList *threads;
    TIdContext *ac;
    threads = IdTCPServer1->Contexts->LockList();
    ac = reinterpret_cast<TIdContext *>(threads->Items[0]);

    UTF8String sendStr;
    sendStr = "send:" + rcvdStr;
    ac->Connection->IOHandler->WriteLn(sendStr);

    for(int idx = 0; idx < 10; idx++) {
        Sleep(100);
        Application->ProcessMessages();
    }

    ac->Connection->Disconnect();
    IdTCPServer1->Contexts->UnlockList();
}
//---------------------------------------------------------------------------
I am putting a wait (the for(int idx=0;...) loop) after WriteLn() so that sending completes before the disconnect. However, I am not sure whether this is a correct way to wait, and I have no idea how long I should wait (in this sample, I wait 1000 ms).
Question: Is there any function that tells me when WriteLn() has completed?

You don't need to wait at all. WriteLn() is a blocking function: it does not return until the entire string has been placed into the socket's outbound buffer. And by default a closed socket performs a graceful shutdown, sending any pending outbound data in the background before the connection is fully closed, even after your code has moved on.
Refer to MSDN for more details:
Graceful Shutdown, Linger Options, and Socket Closure
For the record, Indy's Disconnect() does use both shutdown() and closesocket().
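To make that concrete, here is a minimal sketch of the handler without the artificial wait (same names as the question's code; the try/__finally around the Contexts list is my addition, not something the original code had):

void __fastcall TForm1::IdTCPServer1Execute(TIdContext *AContext)
{
    // ReadLn() blocks until a whole line has arrived
    UTF8String rcvdStr = AContext->Connection->IOHandler->ReadLn(
        IndyTextEncoding(TEncoding::UTF8));

    TList *threads = IdTCPServer1->Contexts->LockList();
    try
    {
        TIdContext *ac = reinterpret_cast<TIdContext *>(threads->Items[0]);
        UTF8String sendStr = "send:" + rcvdStr;

        // WriteLn() returns only after the whole line has been handed to the
        // socket, and the OS flushes pending data during the graceful close,
        // so no Sleep()/ProcessMessages() loop is needed before disconnecting.
        ac->Connection->IOHandler->WriteLn(sendStr);
        ac->Connection->Disconnect();
    }
    __finally
    {
        IdTCPServer1->Contexts->UnlockList();
    }
}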

Related

NumberOfConcurrentThreads parameter in CreateIoCompletionPort

I am still confused about the NumberOfConcurrentThreads parameter of CreateIoCompletionPort(). I have read and re-read the MSDN docs, but the quote
This value limits the number of runnable threads associated with the
completion port.
still puzzles me.
Question
Let's assume that I specify this value as 4. In this case, does this mean that:
1) a thread can call GetQueuedCompletionStatus() (at which point I can allow a further 3 threads to make this call), then as soon as that call returns (i.e. we have a completion packet) I can then have 4 threads again call this function,
or
2) a thread can call GetQueuedCompletionStatus() (at which point I can allow a further 3 threads to make this call), then as soon as that call returns (i.e. we have a completion packet) I then go on to process that packet. Only when I have finished processing the packet do I then call GetQueuedCompletionStatus(), at which point I can then have 4 threads again call this function.
See my confusion? It's the use of the phrase 'runnable threads'.
I think it might be the latter, because the link above also quotes
If your transaction required a lengthy computation, a larger
concurrency value will allow more threads to run. Each completion
packet may take longer to finish, but more completion packets will be
processed at the same time.
This will ultimately affect how we design servers. Consider a server that receives data from clients, then echoes that data to logging servers. Here is what our thread routine could look like:
DWORD WINAPI ServerWorkerThread(HANDLE hCompletionPort)
{
    DWORD BytesTransferred;
    CPerHandleData* PerHandleData = nullptr;
    CPerOperationData* PerIoData = nullptr;

    while (TRUE)
    {
        if (GetQueuedCompletionStatus(hCompletionPort, &BytesTransferred,
            (PULONG_PTR)&PerHandleData, (LPOVERLAPPED*)&PerIoData, INFINITE))
        {
            // OK, we have 'BytesTransferred' of data in 'PerIoData', process it:
            // send the data onto our logging servers, then loop back around
            send(...);
        }
    }
    return 0;
}
Now assume I have a four core machine; if I leave NumberOfConcurrentThreads as zero within my call to CreateIoCompletionPort() I will have four threads running ServerWorkerThread(). Fine.
My concern is that the send() call may take a long time due to network traffic. Hence, I could be receiving a load of data from clients that cannot be dequeued because all four threads are taking a long time sending the data on?!
Have I missed the point here?
Update 07.03.2018 (This has now been resolved: see this comment.)
I have 8 threads running on my machine, each one runs the ServerWorkerThread():
DWORD WINAPI ServerWorkerThread(HANDLE hCompletionPort)
{
    DWORD BytesTransferred;
    CPerHandleData* PerHandleData = nullptr;
    CPerOperationData* PerIoData = nullptr;

    while (TRUE)
    {
        if (GetQueuedCompletionStatus(hCompletionPort, &BytesTransferred,
            (PULONG_PTR)&PerHandleData, (LPOVERLAPPED*)&PerIoData, INFINITE))
        {
            switch (PerIoData->Operation)
            {
                case CPerOperationData::ACCEPT_COMPLETED:
                {
                    // This case is fired when a new connection is made
                    while (1) {}
                }
            }
        }
    }
    return 0;
}
I only have one outstanding AcceptEx() call; when that gets filled by a new connection I post another one. I don't wait for data to be received in AcceptEx().
I create my completion port as follows:
CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 4)
Now, because I only allow 4 threads into the completion port, I thought that since I keep those threads busy (i.e. they never enter a wait state), the completion packet for a fifth connection would not be dequeued and the connection would hang. However, this is not the case: I can make 5 or even 6 connections to my server. This shows that I can still dequeue packets even though my maximum allowed number of threads (4) are already running. This is why I am confused!
A completion port is really a KQUEUE object, and NumberOfConcurrentThreads corresponds to its MaximumCount field: the maximum number of concurrent threads the queue can satisfy waits for.
From the I/O Completion Ports documentation:
When the total number of runnable threads associated with the
completion port reaches the concurrency value, the system blocks the
execution of any subsequent threads associated with that completion
port until the number of runnable threads drops below the concurrency
value.
That wording is imprecise. When a thread calls KeRemoveQueue (which GetQueuedCompletionStatus calls internally), the system hands a packet to the thread only if Queue->CurrentCount < Queue->MaximumCount, even if packets are already sitting in the queue. The system does not literally block any threads, of course. From the other side, look at KiInsertQueue: even if threads are waiting for packets, it wakes one only when Queue->CurrentCount < Queue->MaximumCount.
Also look at how and when Queue->CurrentCount changes: see KiActivateWaiterQueue (called when the current thread is about to enter a wait state) and KiUnlinkThread. In general, when a thread associated with the queue begins waiting on any object (or on another queue), the system calls KiActivateWaiterQueue, which decrements CurrentCount and, if packets exist in the queue, CurrentCount has dropped below MaximumCount, and threads are waiting for packets, hands a packet to one of those waiting threads. Conversely, when a thread stops waiting, KiUnlinkThread is called and CurrentCount is incremented.
Both of your variants are wrong. Any number of threads can call GetQueuedCompletionStatus(), and of course the system does not block the execution of subsequent threads. For example, take a queue with MaximumCount = 4: you can post 10 packets to it and call GetQueuedCompletionStatus() from 7 threads concurrently, but only 4 of them will get packets; the others will wait, even though 6 packets are still in the queue. If one of the threads that removed a packet then begins to wait on something, the system simply wakes another thread waiting on the queue and hands it a packet. And if a thread that previously removed a packet from this queue (Thread->Queue == Queue, i.e. an active thread) calls KeRemoveQueue again, Queue->CurrentCount is first decremented.
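To illustrate the point, here is a small self-contained sketch (the names and the 5-second busy loop are mine, and packets are posted manually with PostQueuedCompletionStatus() instead of coming from real I/O): 7 threads wait on a port created with a concurrency value of 4, 10 packets are posted, and at most 4 of them are handed out at any one time because the busy-looping threads never enter a wait state.

#include <windows.h>
#include <cstdio>

static HANDLE g_port;

DWORD WINAPI DemoWorker(LPVOID)
{
    DWORD bytes;
    ULONG_PTR key;
    LPOVERLAPPED ov;
    while (GetQueuedCompletionStatus(g_port, &bytes, &key, &ov, INFINITE))
    {
        printf("thread %lu dequeued packet %lu\n",
               GetCurrentThreadId(), (unsigned long)key);

        // Busy-spin instead of sleeping, so this thread never enters a wait
        // state and therefore keeps its slot in the port's CurrentCount.
        ULONGLONG end = GetTickCount64() + 5000;
        while (GetTickCount64() < end) { /* spin */ }
    }
    return 0;
}

int main()
{
    // Concurrency value of 4, as in the question.
    g_port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 4);

    for (int i = 0; i < 7; ++i)                 // 7 waiting threads
        CloseHandle(CreateThread(NULL, 0, DemoWorker, NULL, 0, NULL));

    for (ULONG_PTR i = 0; i < 10; ++i)          // 10 dummy packets
        PostQueuedCompletionStatus(g_port, 0, i, NULL);

    Sleep(30000);   // watch the output: at most 4 packets are in flight at once
    return 0;
}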

how to properly wait for completion of NtCreateFile/etc?

I am using the native NT API in my application to access files (NtCreateFile/etc). In order to avoid dealing with STATUS_PENDING I am using the FILE_SYNCHRONOUS_IO_NONALERT flag when opening the related file. So, opening a file looks like this:
UNICODE_STRING fname = toNtUnicode(ntpath);
OBJECT_ATTRIBUTES oa;
InitializeObjectAttributes(&oa, &fname, 0, at.handle(), NULL);

HANDLE h;
IO_STATUS_BLOCK io_status;
NTSTATUS r = NtOpenFile(&h, GENERIC_READ|SYNCHRONIZE, &oa, &io_status,
                        FILE_SHARE_READ, FILE_SYNCHRONOUS_IO_NONALERT|FILE_DIRECTORY_FILE);
if (r != STATUS_SUCCESS)
    ...; // error handling
Unfortunately, this causes the kernel to serialize all operations on the given handle. I.e., if I try to execute multiple reads in parallel (using multiple threads), only one request will be processed at any point in time.
I could get rid of serialization:
HANDLE h;
IO_STATUS_BLOCK io_status;
NTSTATUS r = NtOpenFile(&h, GENERIC_READ|SYNCHRONIZE, &oa, &io_status,
                        FILE_SHARE_READ, FILE_DIRECTORY_FILE);
if (r == STATUS_PENDING)
    ...; // what to do here???
but how exactly should I wait for completion -- WaitForSingleObject() on the file handle? As far as I know, it can become signaled for many reasons -- is there any way to tell that my open-file (or open-directory) operation has completed?
Similarly, if I submit multiple reads (from multiple threads) -- how can I tell which one (if any) has finished?
NtOpenFile is a synchronous API; it never returns STATUS_PENDING to you. Even if a driver returns STATUS_PENDING for IRP_MJ_CREATE, the I/O subsystem waits for the IRP to complete:
https://github.com/Zer0Mem0ry/ntoskrnl/blob/master/Io/iomgr/parse.c#L1404
So you never need to check for STATUS_PENDING after NtOpenFile, and you never need to wait. (In principle you could not wait here anyway: you do not have the file handle yet, so you cannot wait on it or bind it to, say, an IOCP, and NtOpenFile takes no event or other callback mechanism.)
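For the second half of the question (telling which of several outstanding reads has finished), one common pattern when the handle is opened without FILE_SYNCHRONOUS_IO_NONALERT is to give every operation its own event and IO_STATUS_BLOCK and then wait on the events. The following is only a sketch, assuming the native prototypes (NtReadFile, NT_SUCCESS, STATUS_PENDING) and ntdll.lib are available, as they already are for the question's NtOpenFile call; PendingRead and StartRead are hypothetical names used just for illustration.

// One event and one IO_STATUS_BLOCK per outstanding read.
struct PendingRead
{
    HANDLE          event;              // signaled when this read completes
    IO_STATUS_BLOCK iosb;               // status/byte count for this read only
    char            buffer[64 * 1024];
};

bool StartRead(HANDLE file, PendingRead &op, LONGLONG offset)
{
    op.event = CreateEventW(NULL, FALSE, FALSE, NULL);   // auto-reset, non-signaled
    LARGE_INTEGER off;
    off.QuadPart = offset;

    NTSTATUS st = NtReadFile(file, op.event, NULL, NULL, &op.iosb,
                             op.buffer, sizeof(op.buffer), &off, NULL);
    // STATUS_PENDING means this particular read is still in flight; op.event
    // is signaled and op.iosb is filled in when *this* read completes.
    return st == STATUS_PENDING || NT_SUCCESS(st);
}

// To find out which read finished, wait on the per-operation events, e.g.:
//   DWORD which = WaitForMultipleObjects(count, eventArray, FALSE, INFINITE);
//   bytesRead = ops[which].iosb.Information;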

Odd behavior when creating and cancelling a thread in close succession

I'm using g++ version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) and libpthread v. 2-11-1. The following code simply creates a thread running Foo(), and immediately cancels it:
void* Foo(void*){
    printf("Foo\n");
    /* wait 1 second, e.g. using nanosleep() */
    return NULL;
}

int main(){
    pthread_t thread;
    int res_create, res_cancel;

    printf("creating thread\n);
    res_create = pthread_create(&thread, NULL, &Foo, NULL);
    res_cancel = pthread_cancel(thread);
    printf("cancelled thread\n);

    printf("create: %d, cancel: %d\n", res_create, res_cancel);
    return 0;
}
The output I get is:
creating thread
Foo
Foo
cancelled thread
create: 0, cancel: 0
Why the second Foo output? Am I abusing the pthread API by calling pthread_cancel right after pthread_create? If so, how can I know when it's safe to touch the thread? If I so much as stick a printf() between the two, I don't have this problem.
I cannot reproduce this on a slightly newer Ubuntu: sometimes I get one Foo and sometimes none. I had to fix a few things to get your code to compile (missing headers, a missing call to some sleep function implied by a comment, and string literals that were not closed), which indicates you did not paste the actual code that reproduced the problem.
If the problem is indeed real, it might indicate a thread-cancellation problem in glibc's IO library. It looks a lot like two threads doing an fflush(stdout) on the same buffer contents. That should never happen normally, because the IO library is thread-safe. But consider a cancellation scenario like this: the thread holds the mutex on stdout and has just done a flush, but has not yet updated the buffer to clear the output. It is then cancelled before it can do so, and the main thread flushes the same data again.
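For reference, here is roughly what the cleaned-up test looks like here (headers added, the sleep implemented with nanosleep(), and a pthread_join() after the cancel so main() does not exit while the thread is still being torn down); this is my reconstruction, not the poster's exact code:

#include <pthread.h>
#include <stdio.h>
#include <time.h>

void* Foo(void*)
{
    printf("Foo\n");
    struct timespec ts = { 1, 0 };   /* wait 1 second */
    nanosleep(&ts, NULL);
    return NULL;
}

int main()
{
    pthread_t thread;
    int res_create, res_cancel;

    printf("creating thread\n");
    res_create = pthread_create(&thread, NULL, &Foo, NULL);
    res_cancel = pthread_cancel(thread);
    pthread_join(thread, NULL);      /* wait until the cancel has taken effect */

    printf("cancelled thread\n");
    printf("create: %d, cancel: %d\n", res_create, res_cancel);
    return 0;
}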

recv() windows socket takes infinite time - how to timeout?

I use file descriptor sets to find the readable sockets and then go on to read. For some reason, a socket that has no data on the wire goes on to read and the call never returns. Is there a way I can come out of the receive after a timeout?
I am using the Winsock library.
http://tangentsoft.net/wskfaq/newbie.html#timeout
2.15 - How can I change the timeout for a Winsock function?
Some of the blocking Winsock functions (e.g. connect()) have a timeout embedded into them. The theory behind this is that only the stack has all the information necessary to set a proper timeout. Yet, some people find that the value the stack uses is too long for their application; it can be a minute or longer.
You can adjust the send() and recv() timeouts with the SO_SNDTIMEO and SO_RCVTIMEO setsockopt() options.
For other Winsock functions, the best solution is to avoid blocking sockets altogether. All of the non-blocking socket methods provide ways for you to build custom timeouts:
Non-blocking sockets with select() – The fifth parameter to the select() function is a timeout value.
Asynchronous sockets – Use the Windows API SetTimer().
Event objects – WSAWaitForMultipleEvents() has a timeout parameter.
Waitable Timers – Call CreateWaitableTimer() to make a waitable timer, which you can then pass to a function like WSAWaitForMultipleEvents() along with your socket events: if none of the sockets is signalled before the timer goes off, the blocking function will return anyway.
Note that with asynchronous and non-blocking sockets, you may be able to avoid handling timeouts altogether. Your program continues working even while Winsock is busy. So, you can leave it up to the user to cancel an operation that’s taking too long, or just let Winsock’s natural timeout expire rather than taking over this functionality in your code.
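As a concrete illustration of the SO_RCVTIMEO and select() approaches mentioned above (a sketch only; s is assumed to be an already-connected SOCKET):

// Option 1: per-socket receive timeout. On Windows, SO_RCVTIMEO takes the
// timeout in milliseconds; after it expires, recv() returns SOCKET_ERROR
// and WSAGetLastError() reports WSAETIMEDOUT.
DWORD timeoutMs = 5000;
setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, (const char*)&timeoutMs, sizeof(timeoutMs));

// Option 2: check readability first with select() and its timeout parameter.
char buf[512];
fd_set readSet;
FD_ZERO(&readSet);
FD_SET(s, &readSet);
timeval tv = { 5, 0 };                              // 5 seconds, 0 microseconds
int ready = select(0, &readSet, NULL, NULL, &tv);   // first argument is ignored on Windows
if (ready > 0)
{
    int n = recv(s, buf, sizeof(buf), 0);           // data is waiting; this will not block
}
else if (ready == 0)
{
    // timed out: no data arrived within 5 seconds
}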
Your problem is in the while loop where you try to fill the buffer. Just add an if statement that checks the last byte of each received chunk for '\0', then break out of the while loop:
do {
    len = recv(s, buf, BUF_LEN, 0);
    for (int i = 0; i < len; ++i) {
        if (buf[i] >= 32 || buf[i] == '\n' || buf[i] == '\r') { // only keep printable chars and line breaks
            result += buf[i]; // final string
        }
    }
    if (len > 0 && buf[len - 1] == '\0') { // buf[len - 1] is the last byte of the received chunk
        break;
    }
} while (len > 0);

Indy OnExecute first Read slow

I have the following code in my OnExecute in C++ Builder XE:
void __fastcall Test::TestExecute( TIdContext* AContext )
{
    try
    {
        // get the command directive
        DWORD startTime = timeGetTime();
        UnicodeString DBCommand = AContext->Connection->IOHandler->ReadChar();
        DWORD endTime = timeGetTime();

        UnicodeString log;
        log.printf( L"getting command %d ms", endTime - startTime );
        Log( log );
        ...
The log starts at "getting command 100 ms" and creeps up to 300 ms, where it stays for the rest of the application run. I thought that OnExecute was called once data was in the buffer, so why does it take 100 to 300 ms for the first read to succeed?
After this first read, in the same OnExecute, all other data is read very quickly (millisecond to sub-millisecond).
What could be going wrong?
EDIT:
At method launch, AContext->Connection->IOHandler->InputBuffer->Size is 0. After the first read returns, AContext->Connection->IOHandler->InputBuffer->Size contains what's left in the buffer after the read. So this implies that OnExecute is called before any data is actually available to the caller, and the 100-300 ms is the amount of time it takes Indy to fetch the data from the socket and place it in the buffer after it gets notification that data is arriving. That seems way too long.
EDIT:
removed the do{ as it implied a loop that was not there.
The OnExecute event is not tied to the socket buffer at all. TIdTCPServer begins calling OnExecute immediately after the OnConnect event is called, and continues calling OnExecute in an endless loop until the client disconnects (in other words, you should NOT be looping yourself inside your OnExecute handler: read a packet, process it, exit, wait for the next event, repeat).
You are correct that the InputBuffer can grow larger than what you are asking for in code. All of the IOHandler's reading methods get their data from the InputBuffer only, not the socket directly. If the InputBuffer does not have enough bytes cached to satisfy a read request, the IOHandler waits for bytes to be available on the socket and then reads all of them into the InputBuffer for later use. This minimizes how often the socket needs to be accessed and helps keep the socket responsive to new data.
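To see that split in your own handler, a sketch along these lines (using the same Indy 10 API as your code; the 300 ms value is just an example) distinguishes data already cached in the InputBuffer from data still on the socket:

void __fastcall Test::TestExecute( TIdContext* AContext )
{
    TIdIOHandler* io = AContext->Connection->IOHandler;

    // If nothing has been cached yet, let Indy pull whatever is on the
    // socket into the InputBuffer, waiting up to ~300 ms here as an example.
    if (io->InputBufferIsEmpty())
    {
        io->CheckForDataOnSource(300);
        io->CheckForDisconnect();          // raises if the peer has gone away
        if (io->InputBufferIsEmpty())
            return;                        // nothing arrived yet; OnExecute will be called again
    }

    // The data is already local at this point, so the read returns quickly.
    UnicodeString DBCommand = io->ReadChar();
    // ...
}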
