[OPAL] Bug in SIPConnection::OnReceivedSDPMediaDescription
- From: "Guillaume Fraysse" <gfraysse (at) gmail.com>
- Date: Wed, 22 Feb 2006 10:07:34 +0100
An update for those interested in some stress-testing feedback:
I ran a test for 4 days and 320000 SIP calls (60
simultaneous calls, each 45 seconds long on average) straight before
getting a segfault, but I'm not sure yet whether this comes from OPAL/pwlib
or from my platform.
Best regards, and thanks again for this great stack,
Guillaume
On 2/17/06, Guillaume Fraysse <gfraysse (at) gmail.com> wrote:
> Hi Derek,
>
> Well, thanks for giving me some feedback. If it all works fine for you,
> I'm gonna run more tests and identify the problem more clearly.
> Unfortunately I have no easy way to reproduce the problem yet, and I
> doubt I can provide one.
> My application ran around a hundred simultaneous SIP calls for
> several hours without any problem and then segfaulted.
>
> I'll give more feedback after some more testing.
>
> best regards,
> Guillaume
>
> On 2/16/06, Derek Smithies <derek (at) indranet.co.nz> wrote:
> > Hi,
> >
> > I have been doing some stress testing of pwlib, and in particular pthreads.
> >
> > My application was running on a quad cpu box, and would create threads,
> > and destroy threads. At any one time, there could be a hundred threads
> > active.
> > This application also tested the PSafeDictionary components.
> > The linux system on the quad box reported load averages of 50 or so.
> >
> > In total, millions of threads were created/destroyed.
> > Out of this work came a patch for the thread lib, and PQueueChannel.
> >
> > http://cvs.sourceforge.net/viewcvs.py/openh323/pwlib/src/ptlib/unix/tlibthrd.cxx?rev=1.151&view=log
> >
> > http://cvs.sourceforge.net/viewcvs.py/openh323/pwlib/src/ptclib/qchannel.cxx?rev=1.7&view=log
> >
> > My current view is that the thread and qchannel code is correct....
> >
> >
> > I have also seen segfaults from malloc_consolidate, which is deep inside
> > glibc. The current evidence suggests it is either a resource starvation
> > issue when this bug happens, or an issue in FC1.
> >
> > ============
> >
> > If you can describe a way in which I can replicate your faults, I would be
> > happy to "try". Stress testing is good.
> >
> >
> > Derek.
> >
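The thread churn test Derek describes above (waves of threads created and destroyed, about a hundred alive at once, all touching a shared guarded container) could be sketched roughly as follows. This is only an illustration under assumptions: it uses std::thread instead of PWLib's own thread classes, and GuardedMap, StressResult and RunThreadChurn are hypothetical names. GuardedMap merely stands in for the role PSafeDictionary plays in the original test; it is not the PWLib class.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mutex-guarded map, standing in for a thread-safe dictionary.
class GuardedMap {
public:
    void Insert(int key, int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
    }
    void Erase(int key) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_.erase(key);
    }
    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }
private:
    mutable std::mutex mutex_;
    std::map<int, int> map_;
};

struct StressResult {
    std::size_t threadsRun;   // threads that ran to completion
    std::size_t entriesLeft;  // entries still in the shared map at the end
};

// Runs `waves` rounds. Each round starts `perWave` threads that insert and
// then erase their own key in the shared map, and joins them all before the
// next round begins, so at most `perWave` threads are alive at any time.
StressResult RunThreadChurn(std::size_t waves, std::size_t perWave) {
    assert(perWave > 0 && "each wave needs at least one thread");
    GuardedMap shared;
    std::atomic<std::size_t> completed{0};
    for (std::size_t w = 0; w < waves; ++w) {
        std::vector<std::thread> batch;
        batch.reserve(perWave);
        for (std::size_t i = 0; i < perWave; ++i) {
            const int key = static_cast<int>(w * perWave + i);
            batch.emplace_back([&shared, &completed, key] {
                shared.Insert(key, key);
                shared.Erase(key);
                ++completed;
            });
        }
        for (std::thread& t : batch) {
            t.join();  // destroy this wave before creating the next
        }
    }
    return {completed.load(), shared.Size()};
}
```

Calling RunThreadChurn(10000, 100) would create and destroy a million threads in total, a hundred at a time. A clean run only shows that this particular pattern survived; it says nothing about the PWLib code paths themselves.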
> > On Thu, 16 Feb 2006, Guillaume Fraysse wrote:
> >
> > > Hello,
> > >
> > > I've stress tested the CVS HEAD of pwlib and OPAL a couple of times
> > > and I'm experiencing some stability issues.
> > >
> > > This happens only after several thousand calls, but HEAD seems less stable than
> > > the previous CVS versions, dating from November 2005, that I was using
> > > previously.
> > >
> > > The catch is that I haven't had time, and don't have much of it now, to
> > > really help on that.
> > >
> > > If it's of any help, here is a backtrace of a SIGSEGV I got on my last test:
> > >
> > > [...]
> > > #4 0x40284825 in __pthread_sighandler () from /lib/libpthread.so.0
> > > #5 <signal handler called>
> > > #6 0x4172d7b2 in PAbstractList::GetAt () from /usr/local/lib/libpt_linux_x86_r.so.1.11.0
> > > #7 0x41731c98 in PAbstractList::GetReferenceAt () from /usr/local/lib/libpt_linux_x86_r.so.1.11.0
> > > #8 0x408993ca in OpalManager::MakeConnection () from /usr/local/lib/libopal_linux_x86_r.so.2.3
> > > #9 0x40898be2 in OpalManager::SetUpCall () from /usr/local/lib/libopal_linux_x86_r.so.2.3
> > > [...]
> > >
> > > I'm wondering what the state of OPAL and pwlib is. I understand that a
> > > release is coming soon, so I'm just wondering whether anyone else has been
> > > experiencing issues too?
> > > Stability is a big issue for me.
> > >
> > > The CVS version I use is from February 6th. Has any fix been made since
> > > that looks even remotely like it would solve my problem?
> > >
> > > Any information appreciated,
> > > Best Regards
> > >
> > > On 2/3/06, Guillaume Fraysse <gfraysse (at) gmail.com> wrote:
> > > > Hello everyone,
> > > >
> > > > I've been testing the latest release of OPAL that can be found on
> > > > voxgratia.org, the 2.1 beta2, and I found what seems to be a bug, or at
> > > > least a big step backwards for the way I use it.
> > > >
> > > > I'm using an external RTP stack, and it seems that this piece of
> > > > code is the cause of my problem:
> > > >
> > > > SIPConnection::OnReceivedSDPMediaDescription method
> > > >   rtpSession = OnUseRTPSession(rtpSessionId, address, localAddress);
> > > >   if (rtpSession == NULL) {
> > > >     Release(EndedByTransportFail);
> > > >     return FALSE;
> > > >   }
> > > >
> > > > as well as a similar piece of code in the
> > > > SIPConnection::OnSendSDPMediaDescription method.
> > > >
> > > > OnUseRTPSession returns NULL whenever the RTP session is not created,
> > > > even when that is a deliberate choice and not the result of an error.
> > > > So the call gets released...
> > > >
> > > > I'm not sure that this is the intended behaviour.
> > > >
> > > > I made an ugly workaround that simply does not release the call, and added a
> > > > check on the validity of rtpSession around the call to
> > > > SetRemoteSocketInfo.
> > > >
> > > > I haven't been able to check whether the CVS version works better, as
> > > > the SourceForge CVS web interface is down right now.
> > > >
> > > > Best regards,
> > > > Guillaume
> > > >
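The workaround described in the message above (do not release the call when no RTP session is returned, and guard the SetRemoteSocketInfo call) might look roughly like the sketch below. Everything here is a hypothetical stand-in, not the OPAL source: RtpSession and SketchConnection are simplified mock types, the signatures are reduced to a single address string, and the externalRtp flag is an assumption added so that a deliberate "no session" can be told apart from a genuine failure, which the original message does not describe.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an RTP session object; not the OPAL class.
struct RtpSession {
    std::string remoteAddress;
    bool SetRemoteSocketInfo(const std::string& address) {
        remoteAddress = address;
        return !address.empty();
    }
};

// Hypothetical stand-in for the connection; not SIPConnection itself.
struct SketchConnection {
    bool externalRtp = false;    // assumption: application runs its own RTP stack
    bool creationFails = false;  // simulates a genuine session creation error
    bool released = false;
    RtpSession session;

    // Plays the role of OnUseRTPSession: a null result can mean either
    // "deliberately no session" or "creation failed".
    RtpSession* OnUseRTPSession() {
        if (externalRtp || creationFails) {
            return nullptr;
        }
        return &session;
    }

    void Release() { released = true; }

    bool OnReceivedMediaDescription(const std::string& remoteAddress) {
        assert(!released && "connection already released");
        RtpSession* rtpSession = OnUseRTPSession();

        // Only treat a missing session as fatal when one was expected.
        if (rtpSession == nullptr && !externalRtp) {
            Release();
            return false;
        }

        // Guard the call: with an external RTP stack there is no session
        // object on which to set the remote address.
        if (rtpSession != nullptr &&
            !rtpSession->SetRemoteSocketInfo(remoteAddress)) {
            Release();
            return false;
        }
        return true;
    }
};
```

With externalRtp set, the media description is accepted and the call stays up even though no session object exists; with creationFails set instead, the call is still released as before.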
> > > ------------------------------------------------------------------------
> > > Check the FAQ before asking! - http://www.openh323.org/~openh323/fom.cgi
> > > The OpenH323 Project mailing list, using Mailman. To unsubscribe or
> > > change your subscription options, goto
> > > http://www.openh323.org/mailman/listinfo/openh323
> > > Maintained by Quicknet Technologies, Inc - http://www.quicknet.net
> > > ------------------------------------------------------------------------
> > >
> >
> > --
> > Derek Smithies Ph.D.
> > IndraNet Technologies Ltd.
> > Email: derek (at) indranet.co.nz
> > ph +64 3 365 6485
> > Web: http://www.indranet-technologies.com/
> >
> > "Any fool can write code that a computer can understand.
> > Good programmers write code that humans can understand."
> > Martin Fowler
> >
> >
>