Where to begin (I'll try to be quick and can provide any log snippets): I finished the quarterly WIM for production and noticed that replication took 35 hours instead of the usual 10. Watching the logs I could see that the package was replicating, but the status messages that used to get logged were nonexistent. The last WIM I replicated for our WES image created 15k status messages, and although it is embedded, it is still close to the same size. Networking also took a look at the traffic; it was almost zip aside from this transfer. This WIM, when it completed, only logged 12 status messages. But that is just the start. Every day when I come in I check the inboxmon.log and resource mon on our central to make sure nothing is running aground. This last month I've had to bounce the entire environment, 2 primaries and 1 central, because of a backlog of DDR records that stopped processing altogether. Before I could even get to the bottom of those issues, hinv went off a cliff today, and this time a simple restart didn't fix it.
Let me say that I'm doing my best: I'm currently reading 07 Unleashed and am subscribed to Rod Trent's vids on BDNA. Our environment is considerable, approx. 20k devices with 280 secondaries. I came on board 7 months ago to handle OSD and quickly got thrown into the mix with all SCCM admin duties; the only other person here only does software packaging and didn't know what inboxes were until I brought them to his attention. I'm still a LOOOOOOOng way from a novice, let alone proficient. So please go easy on me here. This is my first post.
As stated in the title, the pressing issue, it seems to me, is the insane number of MIF files that are not getting processed. I followed the creation (after fixing WMI) of the root\ccm\policy namespace and the XML file, and verified that it went off to the MP. The one machine that I'm using as a test is in our HQ, so it goes straight to one of the primaries. Where the problem is occurring is the MIF outbox\hinv.box. There are currently 90k-plus files in there, and the sms_inventory_data_loader has been screaming, along with a few other components (note: it's not a handful of hosts flooding; they are all different machine names). I'm really afraid the wheels are coming off this car and I can't do anything to stop it. I've been working with an MS rep, and had them transfer me to a different one today because the last guy wasn't telling me anything that this forum and 5 minutes of searching couldn't tell me.
With that said, I've been searching, and before you blast me with UTFSE, I have, for a week now. I've gotten our IUSR account issues resolved, among other things (we're native mode but no longer have any IBCM clients). The system status in the console had more red in it than the lobby of The Shining. I've been going over flow charts, TechNet, and Google for almost 3 days straight now and things just seem to be getting worse. The best thread I found relating to this particular issue is this
. If that is correct then that means I have potentially 100s if not 1000s of clients that have either become corrupt or been deleted from the DB, and that would mean an INSANE amount of work. Am I looking in the right direction?
I hold tremendous respect for this forum; it is the go-to place for SCCM information. I wouldn't be posting this question if there wasn't some urgency to the situation. Yes, there is still plenty of disk space on the servers; that's about the only green area I have left. Starting with the sms_inventory_data_loader, the MIFs are mostly warning about being deltas and needing to run a full hinv, but we have not had a large influx of new devices. No more than the norm, i.e. 10 to 20 a week, nothing that would spike 1000s of records, and the heartbeat is set to the default. Any help other than "get bent" would be very much appreciated. Thanks in advance, and sorry this wasn't shorter; I just didn't know where to start.
<message edited by keepreading on Monday, January 20, 2014 10:55 PM>
I've done a lot of searching, and today marked the conclusion of my 2 days working with MS techs over the phone. The issue started because of really slow replication to a site for OSD. That is not the issue that has me writing this post. There is a larger issue at hand, but it is eluding me, and so far the two MS techs that I've worked with have gotten me nowhere.