odd bug in RasCollection

Aug 10, 2009 at 8:17 PM
Edited Aug 10, 2009 at 8:20 PM

My code had been working well for many weeks. Now, because of device issues, I have been asked to add support for another device in our project (something that has nothing to do with RAS). I had assumed that the RAS part of the code would remain stable, but now we have a problem: creating the phonebook entries for that device somehow seems to have broken the DotRas library. As usual, I declare

Dim pbk as New RasPhoneBook

Doing so produces a TypeLoadException on pbk.Entries.Count. The text of the error is "The generic type 'DotRas.Design.RasCollection`1' was used with the wrong number of generic arguments in assembly 'DotRas, Version=1.1.3488.31105, Culture=neutral, PublicKeyToken=b378f04384b7892a'." and the type name it reports is "DotRas.Design.RasCollection`1".

I should add that this entry has a long name, WirelessManagerRasConnection, and also a long device name (Sony Ericsson MD400g Mobile Broadband Modem), in case that matters.

Aug 11, 2009 at 2:21 AM
Edited Aug 11, 2009 at 2:24 AM

How exactly was the code used? After reviewing the code, it doesn't look like there is anything physically wrong with it. It might be something I've done that .NET doesn't like, or you might be forgetting something.

How often does it happen?

Is it only with that particular device?

What happens if the entry name is different? (That shouldn't matter, though; the length is within the naming limits.)

 

I've done a bit of research on the web, and this error seems to be more common with generics than I expected.

Aug 11, 2009 at 2:05 PM
Edited Aug 11, 2009 at 2:36 PM

The code that produces the exception is simply the Dim statement I posted earlier. I take this to mean that the exception is in the constructor or something it calls, then. Somehow.

It happens every time, and only when the new device's entries have been created. (I have a machine on which I haven't yet installed the new device's software, and hence its phonebook entries, and this part of my code has been stable there for quite a while now.) I renamed the entry to "short" as a way of testing the length hypothesis further. No change.

As for the generics issue: Uh oh, does this suggest a synergy bug in .NET or something unfortunate? I hate those things.

 

Update: It seems that perhaps the entry itself is somehow buggy. I thought I'd cleared this up by getting a new SIM card, but it seems not. If I try to dial that entry manually from Windows, it fails with error 734. Jeff, does it seem plausible to you that one out-of-spec collection of data in the phonebook could be producing these errors? Could they be the same bug in two different manifestations? I might be able to get permission to send along a sanitized version of our phonebook for you to examine if that would help. I note in passing also that the "manager software" that ships with this device does not use the RAS phonebook at all (or at least does not open the entries it has); it makes some sort of pseudo-LAN type connection. I have contacted the provider about this matter as well.

Aug 11, 2009 at 2:56 PM

The constructor of RasPhoneBook does nothing other than initialize the internal FileSystemWatcher component used to monitor external changes to the pbk file. My guess is that the error is occurring during type initialization, not during the construction of the object. Type t = typeof(RasPhoneBook); should produce the same error. From what I've seen online about the issue, it might be a problem with the key mechanism I made for the RasEntryCollection, but all I did was use another .NET collection internally to store those keys for lookup. Is the error occurring at design time, while debugging, or at runtime?

It would definitely help me diagnose the issue if I had a copy of the phonebook file producing the error. If you can get it sent, that'd be great; if not, we can continue this discussion here till the issue is resolved.

Error code 734 is ERROR_PPP_LCP_TERMINATED: The PPP link control protocol was terminated. What would cause that error, I haven't got a clue.

Aug 11, 2009 at 3:33 PM

Uh oh. I have discovered by examining the phonebook that it does indeed look corrupt after all, to the point where I do not feel I can paste it as is. Here's the stuff that worries me. I've replaced control characters (!) and whatnot with descriptions in parens. All the other entries look fine, though it looks like the file as a whole perhaps becomes corrupt because of this (after all, I haven't even selected an entry when the error occurs).

[WirelessManagerRasConnection]
Encoding=1
Type=0
AutoLogon=0
UseRasCredentials=1
DialParamsUID=150562
Guid=C89556D76369D04F869480F4395BBE7F
BaseProtocol=1
VpnStrategy=0
ExcludedProtocols=3
LcpExtensions=1
DataEncryption=0
SwCompression=1
NegotiateMultilinkAlways=0
SkipNwcWarning=0
SkipDownLevelDialog=0
SkipDoubleDialDialog=0
DialMode=0
DialPercent=0
DialSeconds=0
HangUpPercent=0
HangUpSeconds=0
OverridePref=15
RedialAttempts=3
RedialSeconds=60
IdleDisconnectSeconds=0
RedialOnLinkFailure=0
CallbackMode=0
CustomDialDll=
CustomDialFunc=
CustomRasDialDll=
AuthenticateServer=0
ShareMsFilePrint=0
BindMsNetClient=0
SharedPhoneNumbers=0
GlobalDeviceSettings=0
PrerequisiteEntry=
PrerequisitePbk=
PreferredPort=COM30
PreferredDevice=Sony Ericsson MD400g Mobile Broadband Modem
PreferredBps=7200000
PreferredHwFlow=1
PreferredProtocol=1
PreferredCompression=1
PreferredSpeaker=1
PreferredMdmProtocol=0
PreviewUserPw=1
PreviewDomain=0
PreviewPhoneNumber=0
ShowDialingProgress=1
ShowMonitorIconInTaskBar=1
CustomAuthKey=-1
AuthRestrictions=888
TypicalAuth=1
IpPrioritizeRemote=1
IpHeaderCompression=1
IpAddress=0.0.0.0
IpDnsAddress=0.0.0.0
IpDns2Address=0.0.0.0
IpWinsAddress=0.0.0.0
IpWins2Address=0.0.0.0
IpAssign=1
IpNameAssign=1
IpFrameSize=0
IpDnsFlags=0
IpNBTFlags=1
TcpWindowSize=0
UseFlags=0
IpSecFlags=0
IpDnsSuffix=

NETCOMPONENTS=
ms_msclient=0
ms_server=0
ms_psched=1

MEDIA=serial
Port=COM30
Device=Sony Ericsson MD400g Mobile Broadband Modem
ConnectBPS=7200000

DEVICE=modem
PhoneNumber=*99***1#
AreaCode=
CountryCode=0
CountryID=0
UseDialingRules=0
Comment=
PhoneNumber=(control f deleted)
AreaCode=
CountryCode=0
CountryID=0
UseDialingRules=0
Comment=
PhoneNumber=(other list of unprintable and non-ASCII characters)
AreaCode=
CountryCode=0
CountryID=0
UseDialingRules=0
Comment=
LastSelectedPhone=0
PromoteAlternates=0
TryNextAlternateOnFail=1
HwFlowControl=1
Protocol=1
Compression=1
Speaker=1
MdmProtocol=0

Aug 11, 2009 at 4:22 PM
Edited Aug 11, 2009 at 4:28 PM

Have you tried creating a backup of the pbk file and removing the corrupt entry? I glanced over the InitializeComponent method of RasPhoneBook, and it doesn't even set the path used by the watcher until the phone book has been opened, so there is no way a race issue could have corrupted anything.

Edit: Maybe it has something to do with how VB handles the Dim pbk As New RasPhoneBook statement. What happens if you change it to Dim pbk As RasPhoneBook = New RasPhoneBook? I'm trying to figure out how the phonebook file would have become corrupted in the first place. It might be something to do with ASCII / Unicode.

Aug 11, 2009 at 5:12 PM

Looks like removing it does allow the other devices to function, yes. I have passed this information on to the supplier. I doubt I'll need further help on this side. Though it might be useful, Jeff, if possible, to sanity check the phonebook prior to using it to create your objects. On the other hand, you'd think the RAS API itself would do this ...
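A rough pre-flight check along the lines suggested here could look like the sketch below. It is written in Python purely for illustration and is not part of DotRas or the RAS API. The function name find_suspect_values is made up, and it assumes only what the pasted entry shows: an INI-like layout with [EntryName] headers and Key=Value lines. It flags any value containing control or non-ASCII characters so a person can review it.

```python
import re

# Matches a section header such as "[WirelessManagerRasConnection]".
SECTION_RE = re.compile(r"^\[(?P<name>.+)\]\s*$")


def find_suspect_values(pbk_text):
    """Return (entry, key, value) tuples whose value holds control or non-ASCII characters.

    Hypothetical helper: it relies only on the INI-like layout visible in the
    pasted entry, not on any documented phonebook format.
    """
    suspects = []
    entry = None
    # Split on "\n" only; str.splitlines() would also break lines on some
    # control characters, hiding exactly what we are looking for.
    for raw in pbk_text.split("\n"):
        line = raw.rstrip("\r")
        header = SECTION_RE.match(line)
        if header:
            entry = header.group("name")
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # blank line or anything that is not Key=Value
        if any(ord(ch) < 0x20 or ord(ch) > 0x7E for ch in value):
            suspects.append((entry, key, value))
    return suspects
```

To run it against a real file, one option is to read the bytes and decode them as latin-1 so that every byte survives, then pass the resulting string in. A legitimately localized device or entry name would also be flagged, so the result is a list for manual review rather than proof of corruption.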

Aug 11, 2009 at 6:46 PM
Edited Aug 11, 2009 at 7:19 PM

That would be good; however, since there is no API call in RAS to check the validity of a pbk file, I'd have to build knowledge of the internal file format into DotRas, and that format could change at any time. Long-term maintenance of such functionality would be more headache than it's worth. I could change the exception being thrown while loading the entries if you think that would help any. Maybe have something more descriptive of the problem, and have the inner exception contain the actual error thrown.

Any thoughts on how the file could have been corrupted? The only place in the project that could corrupt the file is the RasHelper class, and all of the APIs I hook use the Unicode entry points to prevent any ASCII/Unicode problems.

Aug 11, 2009 at 7:01 PM

Oh, ok. I agree it sounds too difficult.

As for how it got corrupted, it looks like the installation of the entry corresponding to the new device did it. These cellular things push phonebook entries as part of their installation and it looks like this latest one pushed bogus data for some reason. I'm still waiting on the reseller/carrier to get back to me on this. I am pretty convinced that this is nothing you and I can do anything about in our respective code.

Aug 11, 2009 at 7:18 PM

I think I know how the file got corrupted. It has to do with RAS, modems, and entries that use alternate phone numbers when using DotRas. It's related to that blurb I have on the discussions tab about ensuring you're using the proper version of DotRas for each platform you plan to use the software on if alternate addresses will be used. If you look at the entry you can clearly see 3 distinct phone numbers: 1 primary and 2 alternates.

Here's the reason behind the bug: Microsoft, in all their wisdom, decided to use sizeof(RASENTRY) rather than the size of the structure passed into the API to determine where in memory the alternate entries need to be written. Since DotRas is compatible with multiple platforms, if you're using the WIN2K build on a Windows XP machine, the offset will be outside of where Windows expects the data to be (Windows expects it at 5656, and the size you're using is only 4048). So Windows starts writing the data at offset 5656 past the start of the struct, but I didn't allocate that memory, so it's already in use. Basically, DotRas causes Windows to overwrite a random chunk of memory if alternate entries are used. I already reported the bug to Microsoft and they have fixed it; I just don't know when that patch will be released. You can see the link on the discussions tab for more information about this phenomenon.

Workaround: You will need to compile one version of your application for each version of Windows you plan to support. I know it sucks, but that's all I can do short of hard-coding the struct sizes for each platform in the project. If compiling per platform is out of the question, I can hard-code the struct sizes in the project.
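The size mismatch described above comes down to simple arithmetic. The sketch below is Python used for illustration only. It makes no RAS calls, and the constant and function names are hypothetical. It takes the two sizes quoted in this post and computes how far past the caller's struct Windows would begin writing the alternates.

```python
# Sizes quoted in the post above; the names are made up for this illustration.
WINXP_RASENTRY_SIZE = 5656  # offset at which Windows XP expects the alternates
WIN2K_RASENTRY_SIZE = 4048  # size of the struct the WIN2K build actually passes


def alternates_overrun(passed_size, assumed_size):
    """Bytes between the end of the caller's struct and where the OS starts writing.

    Returns 0 when the OS assumes a struct no larger than the one passed in.
    """
    return max(0, assumed_size - passed_size)


# WIN2K build running on Windows XP: the write starts well past the struct.
print(alternates_overrun(WIN2K_RASENTRY_SIZE, WINXP_RASENTRY_SIZE))  # 1608

# Matching build and platform: no gap.
print(alternates_overrun(WINXP_RASENTRY_SIZE, WINXP_RASENTRY_SIZE))  # 0
```

Under these numbers the write would begin 1608 bytes beyond the end of the struct the caller sized, which is why the damage lands in whatever happens to be stored there.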

Aug 11, 2009 at 7:47 PM

That seems plausible, except for one thing. I only target WinXP SP2 and only run the application on such a platform (including the build machine).

Is it possible that the Sony Ericsson device's install does the same sort of thing? I'll check something else out here too ...

Aug 11, 2009 at 9:24 PM

Easy way to test would be:

Reinstall the software that puts the connection in the phonebook. Once that's done, right-click the entry (in Network Connections) and go to Properties. Click the Alternates button. The alternate entries listed there should be properly formatted and look normal. If they look funky or something breaks when you try to load the dialog, the installation is the problem.

Make sure you don't open the phonebook using DotRas before this point.

Assuming the alternate phone numbers look correct in Windows, try opening the phonebook with DotRas and check the RasEntry.AlternatePhoneNumbers property. The numbers you saw in the Network Connections window should all be listed there. If for some reason they do not match, you'll know DotRas is causing the issue. If you're only targeting XP SP2, you should only need to worry about referencing the WINXP or WINXPSP2 builds. Either of those builds should work fine with what you're doing. According to the version you specified in your original post, you're working with the WINXPSP2 build (indicated by the revision number of the assembly).

Aug 12, 2009 at 3:39 PM

Looks like I have hit one big synergy bug here. Fortunately, Jeff, I have completely exonerated DotRas for the time being since I see the other issues without even touching it.

I thought I'd mention this stuff here just in case someone else encounters it or has any ideas as to what is really going on.

It seems that this particular device only installs its phonebook entry when the connection manager software is started, not when the device is set up. It seems to create a "fallback" interface as part of this entry. Of course, I don't understand (but will be talking to the manufacturer later) why this is so; surely if one were broken the other would be too. No phone number is present for the other, at least visibly in Explorer. The "corrupt" entry is in the phonebook, so I suspect Explorer is doing some sort of sanity checking or the like. This, when first set up, works fine. Also, any other duplications of it also work, manually and otherwise, and without the fallback subentry.

Now, here's where it gets weird and instills a sense of deja vu. When the device is removed, Windows as usual falls the phonebook entries back to the built-in modem, as that is still present. So far so "normal". The problem arises when the device is then reinserted and Windows falls the entries forward. Here somehow there's a slight change: the changes are not inverses of one another. This subsequently breaks something, and the entire structure is then out of spec or the like (at least, nothing dials with that device, and DotRas gets that error, I think; I'm still investigating). Launching the connection manager "fixes" the problem, as it seems (Sysinternals FileMon reports writes) to rewrite the phonebook entry corresponding to the device. (And without reading it from its USB storage, too, as I turned that off for some other testing.) Why this "fixes" other entries also using that device I don't know.

What seems to be different in this case is the fact that the entry has this "alternate device" from the get-go, and also seems to want to create an entry with the conventional built-in modem and also the ISDN adapter device (which is even weirder, since I don't have one installed).

It looks like, then, that we're seeing a "synergy bug" between the Windows "feature" we discussed a while back and the entry that this device is installing. I suspect we'll get finger pointing in both directions on this issue, but I'll make some further remarks if necessary after I speak to the manufacturer.

Aug 12, 2009 at 6:43 PM

If there's anything I can do, don't hesitate to ask.