 

WilliamPitt

(58,179 posts)
Tue Oct 22, 2013, 05:10 PM Oct 2013

"Legacy Computer Systems": Interesting take on the ACA rollout problems

Via Josh Marshall at TPM:

Misunderstanding the Problem?

Are we not grasping the nature of the problem itself? TPM Reader ST says the issue isn't so much the website as legacy computer systems throughout the federal bureaucracy and the need to stitch them all together into a single interface.

From a TPM reader:

(snip)

The Healthcare.gov site itself is just like a server in a restaurant. The server may be the main point of interaction you have -- bringing you menus, taking your order, and bringing you food -- but without the kitchen, there's no meal. And yet when a kitchen messes up and can't get food out, the server often unfairly gets blamed. And it doesn't matter if you have the best waiter in town if the kitchen can't get its act together.

Healthcare.gov is basically just showing you your menu of insurance options, taking your order for insurance, and bringing everything back to you when the order is complete. In tech terms, it's just the front end. All the heavy lifting takes place on the back end, when the website passes your data to an extremely complex array of systems that span multiple agencies (like so many cooks in a kitchen). A central processing hub needs to get data from each of these systems to successfully serve a user and sign up for insurance. And if one of these systems -- several of which are very old in IT terms-- has a glitch and can't complete the task, the entire operation fails for that user. Only if everything works perfectly, and the data gets passed back to the website, does the user have a good experience with Healthcare.gov.

The problem is that throwing more capacity at the website itself, or praising or criticizing how it was built, is as useless as criticizing a server when it's the kitchen that messed up. Maybe cathartic, but not much else.

The complexity involved in making all these systems work together is tremendous. Reader RN doubted that there are 500 million lines of code involved, but if you add up what originally went into building 10 or so huge systems, across multiple agencies, plus all the stuff to make them work together, 500 million lines of code might be realistic. (Especially as many of these systems are old and have been patched and built onto many times.)
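To make the reader's "kitchen" point concrete, here is a minimal sketch of the failure mode being described: a front end that calls several back-end systems in sequence and fails the entire request if any one of them fails. The service names are invented for illustration, not the actual Healthcare.gov hub.

```python
import random

# Hypothetical back-end dependencies; the real hub spans multiple agencies.
BACKENDS = ["irs_income_check", "ssa_ssn_check",
            "citizenship_check", "plan_catalog_lookup"]

def call_backend(name: str) -> dict:
    """Stand-in for a remote call; one legacy system fails intermittently."""
    if name == "citizenship_check" and random.random() < 0.3:
        raise TimeoutError(f"{name} did not respond")
    return {"system": name, "status": "ok"}

def enroll_user(user_id: str) -> dict:
    # Naive orchestration: any single back-end failure aborts the whole job.
    checks = [call_backend(name) for name in BACKENDS]
    return {"user": user_id, "enrolled": True, "checks": checks}

if __name__ == "__main__":
    try:
        print(enroll_user("user-123"))
    except TimeoutError as exc:
        # The user sees a total failure even though 3 of 4 systems worked.
        print(f"Enrollment failed: {exc}")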


The rest: http://talkingpointsmemo.com/edblog/misunderstanding-the-problem

Thoughts?

"Legacy Computer Systems": Interesting take on the ACA rollout problems (Original Post) WilliamPitt Oct 2013 OP
which begs the question as to why this data isn't ported over to data warehouses once a day Pretzel_Warrior Oct 2013 #1
As a mainframe (Legacy) programmer who sends data HERVEPA Oct 2013 #2
Folded, spindled and mutilated indeed.. Fumesucker Oct 2013 #16
The problem is both B2G Oct 2013 #3
what is so utterly complex about the web front end coding? not sure you Pretzel_Warrior Oct 2013 #14
Nothing is complex about it. That's my point. B2G Oct 2013 #23
ok. I admit that wasn't a fair assessment of your knowledge base. just seems the interfaces Pretzel_Warrior Oct 2013 #24
Thanks PW. I hope so too, but B2G Oct 2013 #26
Some servers in the Federal Bureaucracy still use COBOL nadinbrzezinski Oct 2013 #4
lots of them still do... VanillaRhapsody Oct 2013 #6
There's nothing wrong with COBOL and Mainframes.... kiawah Oct 2013 #9
And people who do this for a living, have pointed out nadinbrzezinski Oct 2013 #11
Agreed, but the problem is that these developers B2G Oct 2013 #22
I know what you mean.... I'm part of that dying breed. kiawah Oct 2013 #28
And I hope you're well compensated B2G Oct 2013 #29
Doing OK.... But if you have something better, I always have my eyes open! kiawah Oct 2013 #31
You really willing to hire mainframers? Ohio Joe Oct 2013 #66
Will do. B2G Oct 2013 #67
There is nothing wrong with a DMS-500/250 either, except it's going EOL snooper2 Oct 2013 #47
The problem appears to be the subsidy FarCenter Oct 2013 #5
Yep. And the VA, and Chip, and on and on... B2G Oct 2013 #8
Makes perfect sense. jazzimov Oct 2013 #7
Few here want a serious discussion of the issues Will B2G Oct 2013 #10
Can't fix it WilliamPitt Oct 2013 #13
Agree. And the code review alone B2G Oct 2013 #17
sounds like people are discussing it seriously. Pretzel_Warrior Oct 2013 #18
On this thread, yes B2G Oct 2013 #20
I am really enjoying your take and comments about these issues. pangaia Oct 2013 #32
Change management is going to be a total pain n/t SamYeager Oct 2013 #12
When you say legacy, I think IBM AS400 notadmblnd Oct 2013 #15
I Know RobinA Oct 2013 #49
They've been downsizing and letting older, more experienced, high paid workers go. notadmblnd Oct 2013 #74
Well, there's a stimulus jobs bill right there - Overhaul the government's computer systems. haele Oct 2013 #19
Um OK. We'll get right on that. B2G Oct 2013 #21
Scanning is for paper records, D'oh. The IT/Computer/Systems Engineering grads will overhaul. haele Oct 2013 #27
Govt. IT, the power grid, roads and bridges Mopar151 Oct 2013 #25
Bring home the troops. pangaia Oct 2013 #33
We need to restore a lot of functions to the military Mopar151 Oct 2013 #34
a tad off-topic but today I got as far as the "Review Eligibility" screen steve2470 Oct 2013 #30
I don't trust anything that doesn't give me any specifics; it's usually fluff or Zorra Oct 2013 #35
The first rule of thumb for systems like this... ljm2002 Oct 2013 #36
Very true. Why design a system to get from A to B, when you can design a system Buns_of_Fire Oct 2013 #38
"TMP reader ST" has clearly never developed enterprise-class web infrastructure. Xithras Oct 2013 #37
"Graceful failures" is the key here. Buns_of_Fire Oct 2013 #39
I agree completely. Xithras Oct 2013 #41
I also understand that they outsourced a great deal of the programming B2G Oct 2013 #42
I tend to be on the fence about that one. Xithras Oct 2013 #44
My experience is that they do a fairly decent job, IF B2G Oct 2013 #46
it depends on how much the company feels it has to sell you hollysmom Oct 2013 #61
Part of the problem there is that Indian universities tend to be very strong on theory, winter is coming Oct 2013 #68
The other thing that the Indian programmers told me as we became friends hollysmom Oct 2013 #72
Error-handling stuff can be extremely complicated. MineralMan Oct 2013 #55
I always approached it as a game. Me vs. The User. Buns_of_Fire Oct 2013 #75
Yup. MineralMan Oct 2013 #76
I think a large part of the problem Egnever Oct 2013 #40
Doesn't matter Xithras Oct 2013 #43
They were given an impossible, unrealistic timeline B2G Oct 2013 #45
In which case, it's still the contractor's fault. Xithras Oct 2013 #58
it seems the estimates are now being made by business majors. hollysmom Oct 2013 #63
That's nothing new. We used to call it "management by catastrophics". winter is coming Oct 2013 #69
There were about 5 dozen contractors that took the half billion FarCenter Oct 2013 #70
So you had 3 people full time for three months Egnever Oct 2013 #48
How exactly were they undermined? B2G Oct 2013 #50
Here's just one way Egnever Oct 2013 #52
DDOS attacks? Really? Xithras Oct 2013 #56
The original cost was estimated at around 93 million B2G Oct 2013 #57
QA is 100% the responsibility of the contractor. Xithras Oct 2013 #60
HealthCare.gov builders saw red flags FarCenter Oct 2013 #71
I'm not real IT savvy, but the analogy makes sense. Restaurants I do know. pinto Oct 2013 #51
The problem is every single one of the governments systems has to be able to handle all influx. dkf Oct 2013 #53
Yes, I suppose, but that's not where the main problems are BlueStreak Oct 2013 #62
This is an unacceptable cluster-fuck and the Obama Administration owns it. I want to rip bluestate10 Oct 2013 #54
Well, no. BlueStreak Oct 2013 #59
A lot is gained by looking backwards hollysmom Oct 2013 #64
I agree, but we're really talking about the whole government procurement process BlueStreak Oct 2013 #65
Government & Computer = Oil & Water DeSwiss Oct 2013 #73
S'truth. riqster Oct 2013 #77
Seems some would agree with you jazzimov Oct 2013 #78
 

Pretzel_Warrior

(8,361 posts)
1. which begs the question as to why this data isn't ported over to data warehouses once a day
Tue Oct 22, 2013, 05:12 PM
Oct 2013

instead of transactionally interacting with legacy systems for all users.
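A minimal sketch of what that daily port might look like, using sqlite3 from the standard library as a stand-in warehouse; the legacy extract is faked, and the table and file names are invented for illustration.

```python
import sqlite3

def extract_from_legacy() -> list:
    # Stand-in for a mainframe extract (e.g. a fixed-width file pulled nightly).
    return [("plan-001", "Bronze", 214.50), ("plan-002", "Silver", 301.25)]

def load_warehouse(rows: list) -> None:
    con = sqlite3.connect("warehouse.db")
    con.execute("CREATE TABLE IF NOT EXISTS plans "
                "(id TEXT PRIMARY KEY, tier TEXT, premium REAL)")
    # Idempotent load: re-running the nightly job replaces yesterday's rows.
    con.executemany("INSERT OR REPLACE INTO plans VALUES (?, ?, ?)", rows)
    con.commit()
    con.close()

if __name__ == "__main__":
    load_warehouse(extract_from_legacy())
    print("Nightly load complete; the web tier reads warehouse.db, "
          "never the mainframe.")
```

The trade-off: this works for slow-changing reference data like plan catalogs, but a day-old copy is no good for live eligibility checks against IRS or SSA records, which is presumably why the hub queries those transactionally.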

 

HERVEPA

(6,107 posts)
2. As a mainframe (Legacy) programmer who sends data
Tue Oct 22, 2013, 05:15 PM
Oct 2013

to a data warehouse, I've seen that it often gets mangled and distorted in said warehouse.

 

B2G

(9,766 posts)
3. The problem is both
Tue Oct 22, 2013, 05:17 PM
Oct 2013

New web front end as well as the interfaces to legacy systems.

It's a bloody mess.

 

Pretzel_Warrior

(8,361 posts)
14. what is so utterly complex about the web front end coding? not sure you
Tue Oct 22, 2013, 06:04 PM
Oct 2013

know what you're talking about. It's always the APIs, JavaScript, etc. needed to access disparate data sources.

 

B2G

(9,766 posts)
23. Nothing is complex about it. That's my point.
Tue Oct 22, 2013, 06:13 PM
Oct 2013

It's obviously very buggy, and if they didn't get that right, what kind of mess was made with the mainframe interfaces?

And as far as me 'not knowing what I'm talking about', get back to me after you've managed large IT projects for 20 years.

 

Pretzel_Warrior

(8,361 posts)
24. ok. I admit that wasn't a fair assessment of your knowledge base. just seems the interfaces
Tue Oct 22, 2013, 06:17 PM
Oct 2013

passing data from the web front end's user inputs to the ACA website, and then the data calls to the other databases, are what created huge snafus.

Oh well, I am sure they are working like crazy to do an autopsy and put in code rewrites to get this thing running at 99% as soon as possible, with numerous software updates over many weekends.

 

B2G

(9,766 posts)
26. Thanks PW. I hope so too, but
Tue Oct 22, 2013, 06:24 PM
Oct 2013

my experience tells me they have their work cut out for them.

None of the experts they are 'parachuting in' are going to agree to arbitrary fix dates. Not if they have a functioning neuron. They will need time to review the code, assess the situation and put a plan in place. I know no one wants to hear this, but based on my experience, they will have to delay the mandate and most likely pull the website for a period of time.

That length of time could be extensive in political terms. I hope I'm wrong, but I fear I'm not.

 

nadinbrzezinski

(154,021 posts)
4. Some servers in the Federal Bureaucracy still use COBOL
Tue Oct 22, 2013, 05:21 PM
Oct 2013

HTML 5 compliant sites have issues playing well with COBOL

 

kiawah

(64 posts)
9. There's nothing wrong with COBOL and Mainframes....
Tue Oct 22, 2013, 05:55 PM
Oct 2013

Most big data-crunching businesses still use it (banks, insurance companies, etc.). It's very stable and does its job well....

 

nadinbrzezinski

(154,021 posts)
11. And people who do this for a living, have pointed out
Tue Oct 22, 2013, 05:57 PM
Oct 2013

that lack of extensive beta testing and legacy computer languages are leading to these issues. Chiefly, as mentioned yesterday in the NPR story, COBOL has issues talking with HTML5. It is not a matter of stability, but communications.

(As they put it, it is in the translation program)
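For illustration, that kind of "translation program" often boils down to mapping a fixed-width mainframe record onto a web-friendly format. A minimal sketch with an invented copybook layout (columns 0-8 SSN, 9-38 name, 39-46 annual income in cents) -- not the actual CMS record format:

```python
import json

# Invented fixed-width layout: 9-char SSN, 30-char name, 8-digit cents amount.
RECORD = "123456789" + "DOE JOHN".ljust(30) + "00421375"

def parse_record(rec: str) -> dict:
    """Translate one COBOL-style fixed-width record into a JSON-ready dict."""
    return {
        "ssn": rec[0:9],
        "name": rec[9:39].strip(),
        # Amounts are stored as zero-padded cents with no decimal point.
        "annual_income": int(rec[39:47]) / 100.0,
    }

if __name__ == "__main__":
    print(json.dumps(parse_record(RECORD), indent=2))
```

Every field boundary, padding rule, and implied decimal point is a chance for the two sides to disagree, which is why these translation layers are where so many integration bugs live.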

 

B2G

(9,766 posts)
22. Agreed, but the problem is that these developers
Tue Oct 22, 2013, 06:12 PM
Oct 2013

are literally a dying breed. It's very hard to find mainframe coders anymore. Very. I know, I try.

 

kiawah

(64 posts)
28. I know what you mean.... I'm part of that dying breed.
Tue Oct 22, 2013, 06:43 PM
Oct 2013

I was the youngest COBOL programmer in my office 25 years ago, and I still am today.

Ohio Joe

(21,756 posts)
66. You really willing to hire mainframers?
Wed Oct 23, 2013, 02:36 PM
Oct 2013

PM me. I have over 20 years and there is little to nothing I can't do on a mainframe. I'm currently in Denver but can relocate anywhere.

 

snooper2

(30,151 posts)
47. There is nothing wrong with a DMS-500/250 either, except it's going EOL
Wed Oct 23, 2013, 12:50 PM
Oct 2013

Wonder how COBOL-based CRM systems will play with WebRTC...

Oh wait, they won't LOL


And why did we transition away from using CORBA again?

 

FarCenter

(19,429 posts)
5. The problem appears to be the subsidy
Tue Oct 22, 2013, 05:27 PM
Oct 2013

If the site only had to present the user with a menu of insurance companies and their policies available in a given state, it wouldn't have to interface with as many other systems.

In order to accurately compute the subsidy for each user, it has to verify income with IRS, validate identity with Experian, check SSN with SSA, etc.

 

B2G

(9,766 posts)
8. Yep. And the VA, and Chip, and on and on...
Tue Oct 22, 2013, 05:37 PM
Oct 2013

It's critical functionality. The exchange can't work without it.

I've been pointing to the interfaces for 3 weeks now. My concern has been, and continues to be, that there are so many issues with the front end that all of the interfaces haven't even been exercised yet.





jazzimov

(1,456 posts)
7. Makes perfect sense.
Tue Oct 22, 2013, 05:36 PM
Oct 2013

The website has to be able to talk to multiple different databases which are tailored to multiple types of OS. To further the restaurant analogy, it's like having multiple parts of the kitchen that only speak one language, which means that the server has to be able to speak multiple different languages. And if there is a miscommunication with any of them, then the whole order is rejected.

 

B2G

(9,766 posts)
10. Few here want a serious discussion of the issues Will
Tue Oct 22, 2013, 05:56 PM
Oct 2013

We'll just be branded as trolls.

Sad state of affairs.

 

B2G

(9,766 posts)
17. Agree. And the code review alone
Tue Oct 22, 2013, 06:04 PM
Oct 2013

with all of these new 'experts' descending will take a month, minimum.

Probably longer. But hey, I'm trying to be optimistic.

Then they actually have to fix, test and deploy. Scary is an understatement.

But what do I know? I'm just a right wing shill...

pangaia

(24,324 posts)
32. I am really enjoying your take and comments about these issues.
Tue Oct 22, 2013, 07:35 PM
Oct 2013

Don't quit.
I am a musician. I don't even know how to change a pdf into html, get the picture? But I can sure tell Mozart from Haydn.
So I follow everything I can, trying to understand who screwed up.

Listening to NPR ( I know, I know) on the way home from work today the guest was some IT guy ( you probably heard it also). And not the only IT expert I have heard, of course. He seemed to make a lot of sense...
He spoke of communicating like you do, of too many programmers to begin with, and of an unbelievably complicated problem to solve; that they maybe should have finished one part of the system first -- say, get the part up first where one could just look at the options available without registering -- then, several months later, get another step online, etc, etc.
He also said that bringing in new 'experts' now could slow a fix down even more, because the current geniuses would have to take time off to teach the newbies what had already been done, and the newbies would have to study it all.

With my near-zero knowledge of computers (I DO know how to get rid of cookies and clean out the cache) I have to just listen and use my common sense. My common sense tells me this was a huge fuck up, and as you surmise, the fix may be a while in coming--giving repubs a chance to lick their chops in glee and attack and try to destroy even more. Hope I am wrong.

notadmblnd

(23,720 posts)
15. When you say legacy, I think IBM AS400
Tue Oct 22, 2013, 06:04 PM
Oct 2013

They are dinosaurs, but they're pretty reliable dinosaurs. But having them interface with a rack full of Dell or Sun servers can/will take some time. The files from the servers probably need to be sent over to the dino, and depending on how big the file(s) are, it could take some time. Then, there's probably a human on the other end who is supposed to monitor the file transfer's successful completion, at which point the file will need to be processed on the dinosaur via a batch job, which may be automatic -- or not. Now if there's no one in the control room who has a technical understanding of what is supposed to be going on, mayhem may ensue when, for whatever reason, the file doesn't make it over to the mainframe.
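If that human monitor were replaced, or at least backed up, by an automated check, it might look something like this sketch: verify the transferred file's checksum against a known value before kicking off the batch job. File names are hypothetical.

```python
import hashlib
import pathlib

def sha256_of(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_and_submit(data_file: pathlib.Path, expected_sha256: str) -> bool:
    """Gate the mainframe batch job on the transfer actually completing."""
    if not data_file.exists():
        print(f"ALERT: {data_file} never arrived; paging operations.")
        return False
    if sha256_of(data_file) != expected_sha256:
        print(f"ALERT: {data_file} is corrupt or truncated; job not submitted.")
        return False
    print(f"{data_file} verified; submitting batch job.")
    return True

if __name__ == "__main__":
    demo = pathlib.Path("nightly_extract.dat")
    demo.write_bytes(b"sample extract contents")
    verify_and_submit(demo, sha256_of(demo))
```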

When I left the IT services company where I worked for nearly 30 years, the trend was to hire young college students on the cheap. It was my experience working with many of them that they are immature, unreliable and just not interested in gaining an understanding of the processes that they are monitoring and running. They're happy to follow a script and, in their spare time, they either play with each other or on the internet. When a problem arises, either they are unable to resolve it on their own or they are required to implement a huge bureaucratic process that takes hours or days to get to a resolution.

RobinA

(9,893 posts)
49. I Know
Wed Oct 23, 2013, 12:56 PM
Oct 2013

next to nothing about IT, but I've worked for a living for 33 years now and I'd be willing to bet that the root of this whole problem can be summed up by the word "cheap." When it comes to the knowledge base as well as the infrastructure in every field I've worked in, we've been eating our young since at least the '80s. This is the result. Hard to believe we put a man on the moon once.

notadmblnd

(23,720 posts)
74. They've been downsizing and letting older, more experienced, high paid workers go.
Wed Oct 23, 2013, 04:01 PM
Oct 2013

Back when I was still working in IT, they were gung ho in regards to ISO standards, which created huge bureaucratic, time-wasting processes.


For instance, the environment went from me being able to call a sysadmin and have them reset a password, to first opening a trouble ticket, then calling the help desk to let them know there was a trouble ticket, then waiting for the help desk to contact the sysadmin, then waiting for the admin to reset the password and call the help desk back, and then waiting for the help desk to notify me.

Things like that went from being a two-minute resolve time to sometimes hours or days, depending on whether the person was at their desk, at lunch, in a meeting, or off for the day. So if they're handling all their problems in a manner such as I described, yeah, it's mayhem getting it off the ground and running smoothly.

haele

(12,659 posts)
19. Well, there's a stimulus jobs bill right there - Overhaul the government's computer systems.
Tue Oct 22, 2013, 06:06 PM
Oct 2013

And not as a porkbarrel to contractors - a WPA sort of program, that state governments can also participate in.

Are you a high school grad who wants a scholarship in IT/Computer Engineering, or even an older grad or tech who wants your Cloud/IT/IT Security certifications upgraded for free? The government will pay for it if you agree, after graduation, to work for three years as a GS-11/13 ($40K to $50K a year over the next three years) at a government facility upgrading and administering/debugging their new or modified systems.

You want your PhD in Network IT/IT management? We'll pay for graduate school if you spend four years working out the architecture, developing the new systems, or managing the installation and implementation.

You just want two years of college getting a liberal arts or general science degree, and need work experience to parlay that into a livable-wage job in the private sector? Or tried for the above IT/Computer degree and could only get a general associate's because you're not really that mathematically inclined or changed your mind about what your major was going to be?

Here's a comfy chair and table, a scanner, an all-in-one computer hooked up to the government server and loaded with Adobe Acrobat Professional and/or an official Federal forms program, and a room full of files from 1820 into the mid-1990s. You owe us two full-time years, and we'll cover those two years for up to $40K a year and government bennies, depending on the COLA for where you're located after the first two years of college.

Get scanning! You could be working at any number of locations with any number of Federal, State, or County records. Let's make digital copies of all these disintegrating paper records and bring them up to date.
Let's have PDFs available online with keywords for all documents, and put data from records that can still be modified over time or need to be tracked (like ongoing medical records of living people, or USGS records that track environmental conditions) into active documents that will continue to be accessed by government programs.

One can still continue to go to school for the next two years as an undergrad while they're working that sort of job - lots of people work full time and continue on for their BA or BS.

A project like this could potentially employ up to 100K Americans across the nation - rural and urban - because the majority of them would be working "on site". It would be sort of like the WPA and CCC projects during the Great Depression. Pay people a living wage - not a great wage, but a living wage - help them support their families, give them an educational boost, and bring Government records and programs out of the 20th century, baselining this information to (at least) 2014 digital standards.

It may cost a bit to implement, but it will save money in the long run. It's an investment.

Haele

 

B2G

(9,766 posts)
21. Um OK. We'll get right on that.
Tue Oct 22, 2013, 06:10 PM
Oct 2013

Scanning? Really?

We're going to overhaul the fed's computer systems with scanners?

haele

(12,659 posts)
27. Scanning is for paper records, D'oh. The IT/Computer/Systems Engineering grads will overhaul.
Tue Oct 22, 2013, 06:41 PM
Oct 2013

The low-IT-skill labor does the scanning. That's the way it's done at hospitals, universities, corporations, etc...
I know it's a "make-work" project that sounds stupid and wasteful, but even just considering the backlog at the VA, and the mess contractors make getting involved with the patchwork contracts to upgrade various unique systems, the government has to get a handle on oversight.

Scanning is part of the process. You don't need a graduate degree, or to make the effort to re-create every single word from the soil acidity report of a BLM investigator in 1960 or from a tattered readiness report from a Marine Corps training exercise during WWII, do you?
An official electronic copy (not the Mormon church copy they "allow" outsiders to access) of the census record from a Nebraska territory backwater in 1850, or a recording of 1930s birth certificates from a county seat in Oregon, is also important. So the AS/AA or certificate seekers can trade two years of paid education for two years of scanning documents and data input to catch the government up.

Those records are important, just as important as leveraging additional training and education to get professionals involved with developing an actual integrated Federal computer system, that is able to create a standard that other government systems can interact with.

The problem upgrading a system as large as the Federal Government's is that everyone wants to do it their own way. So someone has to take charge and pull all the strings together, and unless you want IBM, Google, or Booz Allen Hamilton to be in charge of the project - and co-opting it, along with Federal computers and records, because that's what happens when you have private industry managing government projects - you need to have a not-for-profit government entity such as the GSA fully in charge.

Haele

Mopar151

(9,983 posts)
25. Govt. IT, the power grid, roads and bridges
Tue Oct 22, 2013, 06:22 PM
Oct 2013

All need fixin', would be a good financial investment, and would give a leg up to a lot of folks who need it. The old WPA acronym would be quite fitting, updated:

Work. Progress. America.

pangaia

(24,324 posts)
33. Bring home the troops.
Tue Oct 22, 2013, 07:39 PM
Oct 2013

Give them all, or those who want it, training in IT engineering, bricklaying, waste management, whatever, and fix the dang place.
Same money spent, or less, and something to show for it... and no legs blown off!

Mopar151

(9,983 posts)
34. We need to restore a lot of functions to the military
Wed Oct 23, 2013, 12:24 AM
Oct 2013

Food service, transportation, engineers - even if we make it a separate service. We could employ the military to advantage in reinventing itself for a changing world, and in resuming its role as a vocational training powerhouse.

steve2470

(37,457 posts)
30. a tad off-topic but today I got as far as the "Review Eligibility" screen
Tue Oct 22, 2013, 07:13 PM
Oct 2013

I tried refreshing the screen and using the Incognito mode trick. Nothing worked.

I'm in no big hurry. My point is, the website is slowly improving. The best I could get before was to submit my application and it was filed away. Maybe in a month or so I can review all my coverage options and buy something.

Zorra

(27,670 posts)
35. I don't trust anything that doesn't give me any specifics; it's usually fluff or
Wed Oct 23, 2013, 12:45 AM
Oct 2013

weasel speak.

For instance:

"And if one of these systems -- several of which are very old in IT terms-- has a glitch and can't complete the task, the entire operation fails for that user."

and

"(Especially as many of these systems are old and have been patched and built onto many times.)"


What are these "old" systems he's referring to (Social Security? IRS? Experian? Blackwater?) and why doesn't he name them in the article? How does he know these systems are very old?

I'm not saying the article isn't basically true, just that I have no reason to really believe it is, except to take the author's word for it, and I don't know the author.

ljm2002

(10,751 posts)
36. The first rule of thumb for systems like this...
Wed Oct 23, 2013, 01:50 AM
Oct 2013

...is: K.I.S.S. -- Keep It Simple, Stupid. It sounds like that rule was ignored completely.

One of the first reports said there were simply too many files being transferred back and forth between the server system and the user's computer, and that is why the system was so easily overloaded -- if that is the case, then the legacy systems on the server side are not the issue; rather, the issue is the design of the Web-based portion of the system.

There may be issues on the server side too, that were initially masked by the frontend issues. Things can also be simplified on the server side. You don't need realtime access to a bunch of legacy systems -- in fact, that sounds rather like inviting a nightmare scenario. Without knowing more about the system it is hard to make suggestions.

This sounds very much like a loosely managed project, not surprising when we hear how many different contracting groups were involved.

Hard to imagine how much $$$ was spent on the project, for such a poor result. If the system really has 500 million lines of code, the long term goal should be to replace it.

Buns_of_Fire

(17,181 posts)
38. Very true. Why design a system to get from A to B, when you can design a system
Wed Oct 23, 2013, 02:37 AM
Oct 2013

that goes from A to R to C to H to P to W and THEN to B. I've seen too many otherwise workable concepts made almost useless by "designers" who were more interested in justifying their positions than in producing a solid product.

Xithras

(16,191 posts)
37. "TMP reader ST" has clearly never developed enterprise-class web infrastructure.
Wed Oct 23, 2013, 01:59 AM
Oct 2013
"And if one of these systems -- several of which are very old in IT terms -- has a glitch and can't complete the task, the entire operation fails for that user."

I co-owned a 25 employee software consulting company just outside the Silicon Valley (well...Dublin anyway) for a decade. I taught computer science to college students for nearly as long. If any of my employees had written a web system for an enterprise client that was incapable of queuing requests and accounting for performance differences between enterprise systems, they'd be fired on the spot for incompetence. This is pretty fundamental 101 level stuff.

Enterprise SOA requires that performance differences between new and legacy systems be accounted for, and contingencies be put in place to account for load and communication failures between the various infrastructure components. If this wasn't done, it means that someone was mindbogglingly incompetent. No web system should EVER fail simply because a call to an external service failed. Calls should be tested, and if the expected response isn't received within an allowable period, alternate processes should be in place to allow for a graceful failure, retry the requests, queue them for later, etc. This isn't an "ideal", it's a standard when writing these types of applications. Simply failing should never be an option.
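A minimal sketch of the contingencies described here -- bound the external call with a timeout, retry with backoff, then queue the request for later instead of failing the user outright. The remote call is simulated, and the names are invented for illustration.

```python
import queue
import random
import time

# Requests parked here would be drained by a background worker (not shown).
retry_later: "queue.Queue[str]" = queue.Queue()

def call_external_service(user_id: str, timeout_s: float = 2.0) -> dict:
    # Stand-in for an HTTP/SOA call that is slow or flaky.
    if random.random() < 0.5:
        raise TimeoutError("no response within %.1fs" % timeout_s)
    return {"user": user_id, "verified": True}

def verify_with_fallback(user_id: str, attempts: int = 3) -> dict:
    for attempt in range(1, attempts + 1):
        try:
            return call_external_service(user_id)
        except TimeoutError:
            time.sleep(0.1 * 2 ** attempt)  # exponential backoff between tries
    # Graceful degradation: accept the application, finish verification later.
    retry_later.put(user_id)
    return {"user": user_id, "verified": "pending",
            "message": "We'll confirm this detail and email you."}

if __name__ == "__main__":
    print(verify_with_fallback("user-123"))
    print("queued for retry:", retry_later.qsize())
```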

Then again, with the way government contracts are handed out nowadays, the actual code was probably written by interns and H-1Bs, nominally "supervised" by middle managers who last wrote code when Java was still new. We used to do contract work for the state of California, and I could tell you horror stories about the crap that other contractors foisted off onto the state on the taxpayer's dime.

Buns_of_Fire

(17,181 posts)
39. "Graceful failures" is the key here.
Wed Oct 23, 2013, 02:57 AM
Oct 2013

So many times, an error condition is just left hanging out there, usually by programmers who have never had to handle data-editing procedures.

Today, more emphasis seems to be placed on prettyfication (my term) than on whether or not the damned thing works.

Sloppy. And I'd appreciate it if they got off my lawn, too.

Xithras

(16,191 posts)
41. I agree completely.
Wed Oct 23, 2013, 12:01 PM
Oct 2013

There's a dearth of use-case analysis and error contingency planning skills among younger programmers today. This is a very real problem for big consulting companies, because they like to hire younger programmers who work cheap and don't complain about abuse. This problem is compounded by consulting companies that routinely gauge programmer performance using the "lines written vs. time spent" metric. It encourages programmers to write lots of fluff quickly, but discourages them from actually thinking about what they're writing, how it might be used, how efficient it might be, and how their code would react to an unexpected result, input, or failure. Time spent thinking is time NOT spent writing code.

In other words, they want code monkeys, not software engineers. Sadly, companies like these tend to win a LOT of government contracts because they have lower operating costs and can underbid competitors who actually spend time to develop reliable, high quality software solutions. It's a leading reason why my company eventually gave up on government work...we couldn't match the bids of the crap shovellers.

 

B2G

(9,766 posts)
42. I also understand that they outsourced a great deal of the programming
Wed Oct 23, 2013, 12:06 PM
Oct 2013

to India. Mistake one.

Xithras

(16,191 posts)
44. I tend to be on the fence about that one.
Wed Oct 23, 2013, 12:41 PM
Oct 2013

There are a lot of high quality developers in India who can assemble great software products.

There are far more crappy developers who can cobble something together that "works" well enough to meet the minimum project specifications.

The problem tends to be that companies and government agencies outsource because they're looking to lower costs, and the same basic rules apply in India as in the U.S.: You get the developers you're willing to pay for. If you want the low bidder, you have to expect low quality.

How's the adage go? Fast. Cheap. Good. You only get to pick two.

 

B2G

(9,766 posts)
46. My experience is that they do a fairly decent job, IF
Wed Oct 23, 2013, 12:44 PM
Oct 2013

they get crystal clear specifications.

That was obviously not the case here.

hollysmom

(5,946 posts)
61. it depends on how much the company feels it has to sell you
Wed Oct 23, 2013, 02:22 PM
Oct 2013

When they're first trying to win your business, overseas companies put their best people on the job, but after they have the contract, they put trainees on it. I spent a lot of time talking to my Indian friends; they were very nice people, but woefully under-experienced. It seems their company would tell them they would get annual raises but fire them after a few years; the result was we were always training new people to do the same old work. Also, the company lied a lot to get business. They swore they were CMM level 5, but I could never get any paperwork from them. I gave very specific specs and got back garbage, then I would have to work until 8 PM so I could speak to someone on the phone, because they were 12 hours off our schedule.

I presented the president with a chart that showed how we were spending more to get the same work done in India by this company, and the president said the tax savings would pay for it. I explained that somewhere else on this board.

winter is coming

(11,785 posts)
68. Part of the problem there is that Indian universities tend to be very strong on theory,
Wed Oct 23, 2013, 03:08 PM
Oct 2013

but not so much on actual programming, especially when it comes to larger, team-oriented projects. That, coupled with the current tendency to fire them after a couple of years, means you get a steady stream of cheap but inexperienced coders. And if they're dealing with legacy code... let's just say it takes a fair bit of patience, experience, and skill to find your way around most legacy code.

hollysmom

(5,946 posts)
72. The other thing that the Indian programmers told me as we became friends
Wed Oct 23, 2013, 03:41 PM
Oct 2013

is that, culturally, you never say no to your boss. I had enough problems with American programmers feeling they could not say no, mostly in accounting firms, where someone would say that you needed to do whatever your boss asked or get a bad review. I personally liked people who made good challenges to me, and we came up with a better product. No one is perfect, and if an underling had a better idea, I was willing to go with it and give them the credit.

As to legacy code - there is good old code and bad old code. A good system has good old code easy to follow and documented properly.

MineralMan

(146,317 posts)
55. Error-handling stuff can be extremely complicated.
Wed Oct 23, 2013, 02:00 PM
Oct 2013

First, you have to understand what errors might be encountered, then deal with each error condition in a way that doesn't crash the program or log off the user. For complex systems like this one, the errors that can occur are many, especially when dealing with third party databases. If inexperienced people are doing anything to design the error handlers, they'll miss many possible error conditions and either leave the routines hanging or crash them, or just blow the whole error off and pretend it didn't happen, letting errors accumulate until some other routine causes the inevitable crash.

Errors in user input, alone, offer many opportunities for crap to come into the routine. And anticipating user errors is fraught with danger. Prompting users to fix their errors is difficult, too, and done wrong, simply compounds the error. Someone on DU wrote about the very basic thing of selecting a username. The instructions were vague, but the requirements for usernames were precise. So, many user errors result, stressing the error-handling routines and bollixing up the works in many possible ways.

You might think that someone typing on a keyboard can only make so many different errors, but it's not true. Users do incredibly stupid things, like copying and pasting weird stuff into user input fields. Validating input is crucial. If it doesn't match the template, back it goes to the user, with additional instructions. And the additional instructions have to actually get the right input from the user. If they don't, it's all a waste of time. Coding these routines takes time and imagination and then thorough testing with all possible crap that might find its way into an input field. And that's just user input.
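For instance, a minimal sketch of validating a single field (the username example above) against a precise template, returning instructions specific to what went wrong rather than a vague "invalid input". The rules here are invented examples, not Healthcare.gov's actual requirements.

```python
import re

def validate_username(name: str) -> list:
    """Return a list of specific problems; an empty list means acceptable."""
    problems = []
    if not 6 <= len(name) <= 20:
        problems.append("Username must be 6-20 characters long.")
    if not re.fullmatch(r"[A-Za-z0-9_]*", name):
        problems.append("Use only letters, digits, and underscores.")
    if name and not name[0].isalpha():
        problems.append("Username must start with a letter.")
    return problems

if __name__ == "__main__":
    for candidate in ["jdoe", "john_doe_1984", "7up!", "john doe"]:
        errs = validate_username(candidate)
        print(candidate, "->", errs or "OK")
```

The point is the shape of the response: every failed check maps to a concrete instruction, so the user's next attempt is more likely to succeed instead of compounding the error.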

I hated writing error handling routines. Hated it! But, what are you going to do? If you don't, your stuff doesn't work, and you constantly run the risk of crashing whatever is running. Put simply, ON ERROR RESUME NEXT is not a workable error handling routine.

Buns_of_Fire

(17,181 posts)
75. I always approached it as a game. Me vs. The User.
Wed Oct 23, 2013, 04:27 PM
Oct 2013

"Now, how many ways can they POSSIBLY screw up entering their own name?"

After my first few years in the business, I evolved into a master of defensive programming (or paranoid programming, however one might want to look at it).

That, and the fact that I HATED getting calls at 2 AM!

MineralMan

(146,317 posts)
76. Yup.
Wed Oct 23, 2013, 04:31 PM
Oct 2013

I had a small shareware software company in the late 90s and early 2000s. If I left bugs, I got support calls. I hate support calls. So, I got really good at handling user errors, and as each version emerged, it was more and more error-free. Finally, there wasn't anything left to fix. I closed the company down because shareware ceased to be a working business model.

 

Egnever

(21,506 posts)
40. I think a large part of the problem
Wed Oct 23, 2013, 03:16 AM
Oct 2013

is the sheer number of different systems it has to pull from and compile.

You have to do all the different personal data systems plus all of the different insurance companies from all the different states some with many different companies.

Pretty daunting task and I would bet nigh on impossible to test thoroughly.

Xithras

(16,191 posts)
43. Doesn't matter
Wed Oct 23, 2013, 12:35 PM
Oct 2013
"...is the sheer number of different systems it has to pull from and compile."

Ultimately irrelevant. Any system can be made fast and reliable if you'll spend the money to engineer and architect it properly. If your system is reliant on connections to outside data sources, you should have enough content caching and queuing routines in place to keep performance acceptable. If that wasn't possible, they should have changed the presentation model to account for slow or delayed responses from the remote servers. The notion that a site should fail because its external connections are slow is 100% amateur.
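A minimal sketch of the content-caching idea, assuming a plan catalog that changes rarely: serve from a local TTL cache so a slow or down remote source degrades performance to "slightly stale" instead of "broken". The names and TTL are illustrative.

```python
import time

# Cache maps state -> (timestamp, plans). TTL is illustrative.
_cache: dict = {}
TTL_SECONDS = 300

def fetch_plans_from_remote(state: str) -> list:
    # Stand-in for an expensive call to an external system of record.
    time.sleep(0.2)
    return [f"{state}-bronze", f"{state}-silver", f"{state}-gold"]

def get_plans(state: str) -> list:
    now = time.monotonic()
    hit = _cache.get(state)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]                      # fast path: no remote call at all
    plans = fetch_plans_from_remote(state)
    _cache[state] = (now, plans)
    return plans

if __name__ == "__main__":
    get_plans("CA")         # slow: goes to the remote source
    print(get_plans("CA"))  # fast: served from the cache
```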

I've been writing web SOFTWARE (not pages...software) since 1995. I've worked for massive corporations, government agencies, and small startups. I've owned a consulting company with actual employees that did tens of millions of dollars in projects during its existence, and taught software development in college classrooms after we folded it up. Heck, I'm typing this very message on one screen of my four-head programming machine, while the other three screens are full of IDE's and data related to a cloud video editing solution I'm currently writing for a client in the MOOC space.

I know how to engineer large scale web projects. More importantly, a company being paid hundreds of millions of dollars SHOULD know how to do the same thing. Believe me, and all my experience, when I tell you that blaming poor site performance on "slow external services" is NOT an excuse that would be accepted by anyone with any kind of experience working on software at this scale.

"Pretty daunting task and I would bet nigh on impossible to test thoroughly."
About a decade ago my company landed a ~$3 million project to design and implement a web based incident reporting and tracking system for the California EPA. As part of that project, we brought on five QA staffers who were ONLY paid to break things and document the failures. Most had programming backgrounds, but they didn't write a single line of code on the software, and weren't allowed to interact with the developers.

They were paid $25,000 each on a 3 month contract that required each of them to document at least one new bug or recommended optimization a day. Nothing was off limits...they could throw anything they wanted at the software, hack on it all they wanted, and basically try to force every flaw out of it that they could. Minor bugs like typos earned them an additional $50 bounty per report. Major bugs that caused functionality issues got them a $100 bounty per report. And if they managed to crash the system, they'd earn a $1,000 reward.

We did that for a relatively mundane government project with a relatively modest budget. This was a half-billion dollar project to launch one of the most anticipated and high-traffic government websites to be developed since the IRS went online in the 90's. They should have had a small army of QA people testing out every use-case that could be imagined. The fact that they didn't is disturbing...someone was trying to cut corners to save time & money while padding profits.
 

B2G

(9,766 posts)
45. They were given an impossible, unrealistic timeline
Wed Oct 23, 2013, 12:42 PM
Oct 2013

Money was evidently not an object.

What will be interesting to see is what comes out from the 'worker bees'. My educated guess is they told their immediate managers that it was going to be a complete clusterfuck and they were told to sit down, shut up and code.

The blame for this fiasco falls directly on senior management/admin officials that refused to listen to objections and see the warning signs for the past year.

You can force anyone to embark on a death march, but you can't make them survive it.

Xithras

(16,191 posts)
58. In which case, it's still the contractor's fault.
Wed Oct 23, 2013, 02:09 PM
Oct 2013

Rule #1 when taking on consulting clients: Walk away from impossible projects. All they do is ruin your reputation and land you in court.

If the contractor took the half-billion dollar contract knowing they couldn't meet the deadlines and requirements adequately, and deliver the product that the government expected, then they committed fraud. If they didn't know that they couldn't hit those deadlines, then they were incompetent. Either way, the blame lands back on the companies that wrote the system.

hollysmom

(5,946 posts)
63. it seems the estimates are now being made by business majors.
Wed Oct 23, 2013, 02:24 PM
Oct 2013

at least in the last few companies I worked for, IT was given dates, without any rationale, by sales people and managers. They just go by whatever people ask and then expect you to do miracles under budget.

 

FarCenter

(19,429 posts)
70. There were about 5 dozen contractors that took the half billion
Wed Oct 23, 2013, 03:16 PM
Oct 2013

Centers for Medicare & Medicaid Services, a part of Health and Human Services, maintained responsibility for system integration and test.

So the "one throat to choke" is not a private company.

It's like building a house with plans you drew yourself and being your own prime contractor. If the electrical service panel won't support the HVAC system, and the kitchen appliances are incompatible with the circuits and plugs, it's your problem.

http://www.cms.gov/About-CMS/About-CMS.html
http://www.cms.gov/About-CMS/Agency-Information/History/index.html

 

Egnever

(21,506 posts)
48. So you had 3 people full time for three months
Wed Oct 23, 2013, 12:56 PM
Oct 2013

Working on a project that was tiny in scope by comparison. And you had the full backing of the company you were doing the project for.

This site on the other hand has had nothing but people trying to sabotage it from the start by cutting funding every place they could. There was no way for them to possibly get this done in the time frame they had doing it your way. Too many road blocks and no way to know until very late in the process who all they would have to interface with. This one Web site has to serve over half the nation because of Republicans refusing to set up their own exchanges.

Excuse me while I dismiss your comparison as ludicrous. There has never been a Web site ever trying to tie so many databases together and serve so many people at the same time while constantly being undermined at every turn.

Inexcusable my butt.

 

Egnever

(21,506 posts)
52. Here's just one way
Wed Oct 23, 2013, 01:08 PM
Oct 2013
http://m.healthcarefinancenews.com/news/republicans-after-failing-fund-aca-take-issue-hhs-solicitation

And if you for a second think there aren't Republican-funded groups actively trying to bring that site down by any means they can, be it simple DDoS attacks or other means, you are incredibly naive.

Xithras

(16,191 posts)
56. DDOS attacks? Really?
Wed Oct 23, 2013, 02:04 PM
Oct 2013

DDOS attacks only compromise sites that don't anticipate and plan for them. Any idiot can link up Cloudflare (or its numerous competitors) to defend themselves against that kind of thing nowadays. Hell, for only a few million of the half-billion dollars they spent, the contractor could have built their OWN Cloudflare-style filtered & distributed CDN and accomplished the same thing.

HA (high-availability) website architecture has changed dramatically over the past decade, and things like DDoS attacks only impact those who aren't keeping up.

There's a huge difference between running a private or low-tier standalone website in a colo or on AWS, and putting together a modern enterprise or global web system. Single points of failure get you fired nowadays.

I'm not naive, and I probably have more experience dealing with web architecture issues than 98% of the people on this board. I don't have much sympathy for low quality systems engineering, which this entire project reeks of. This was a performance failure by the contractor and developers, and they should be crucified for it.
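One in-house layer of that kind of flood protection (behind a CDN such as Cloudflare, not instead of one) is per-client rate limiting. A minimal token-bucket sketch with illustrative limits, not anything from the actual Healthcare.gov stack:

```python
import time

class TokenBucket:
    """Refill tokens at a steady rate; each request spends one token."""
    def __init__(self, rate_per_s: float, burst: int):
        self.rate, self.capacity = rate_per_s, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should return HTTP 429 Too Many Requests

buckets: dict = {}

def allow_request(client_ip: str) -> bool:
    bucket = buckets.setdefault(client_ip, TokenBucket(rate_per_s=5, burst=10))
    return bucket.allow()

if __name__ == "__main__":
    allowed = sum(allow_request("203.0.113.9") for _ in range(100))
    print(f"{allowed} of 100 rapid-fire requests allowed")
```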

 

B2G

(9,766 posts)
57. The original cost was estimated at around 93 million
Wed Oct 23, 2013, 02:04 PM
Oct 2013

I've seen actual costs to date of anywhere from $300-600 million.

These are not DDoS attacks by some Repuke group. It has not been defunded. They have not stopped anyone from approving incredible increases to get the system in place.

I am not the one being naive here.

Xithras

(16,191 posts)
60. QA is 100% the responsibility of the contractor.
Wed Oct 23, 2013, 02:20 PM
Oct 2013

Whether or not some elements of the government supported or sabotaged their work is irrelevant. The software and systems were designed by the contractor. The contractor is responsible for the QA, and performs it internally.

As for the rest, I just don't know what to tell you. Whether they were connecting to one source or 50 is irrelevant. If they were using proper Design Patterns to standardize error handling, pooling, queuing, or whatever load-compensating measures they selected, the actual number of connections shouldn't matter.

As for the timeline, it's the contractor's duty to recognize and refuse the impossible. If you take a client's money knowing that you can't deliver the product they're requesting on the timeline they require, you're committing fraud.

 

FarCenter

(19,429 posts)
71. HealthCare.gov builders saw red flags
Wed Oct 23, 2013, 03:28 PM
Oct 2013
Outside software companies usually perform the final integration testing for a big website. Congressional investigators have concluded that a team at the Centers for Medicare & Medicaid Services (CMS), not private software developers, handled the HealthCare.gov integration testing during the final weeks.


http://www.lifehealthpro.com/2013/10/22/healthcaregov-builders-saw-red-flags

Identity management, which has been a trouble spot, appears to be a pre-existing CMS system.

EIDM is the consolidated Identity and Access Management system in CMS, one of the largest Oracle 11gR2 Identity and Access Management deployments in the world. It integrates all of the Oracle components to support 100 million users, providing Identity and Access Management services for the Federal Health Insurance Exchange, for the health insurance exchanges in all 50 states that use the FFE level of IDM integration, and for 100s of CMS federal applications.


http://www.civicagency.org/2013/10/learning-from-the-healthcare-gov-infrastructure/
 

dkf

(37,305 posts)
53. The problem is every single one of the government's systems has to be able to handle the full influx.
Wed Oct 23, 2013, 01:17 PM
Oct 2013

The system is only as fast as the slowest of these. I think I read the one that checks for citizenship was never built to handle all that flow.

So it checks your info with the IRS, then sees if you or your kids are enrolled in any other federal health programs (VA, Medicare, Medicaid, CHIP, Indian Affairs, etc., etc.), then it verifies your ID and whether you are a citizen, then it sends you to your particular state's insurance options. That's many interactions for each person's entry, and all of them must be decently responsive.
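The arithmetic here is simple but brutal: sequential calls cost the sum of their latencies, while concurrent calls cost roughly the slowest one. A small sketch with invented latencies:

```python
import asyncio
import time

# Invented per-check latencies, in seconds.
CHECKS = {"irs": 0.8, "ssa": 0.3, "medicaid": 0.5, "citizenship": 1.2}

async def run_check(name: str, latency_s: float) -> str:
    await asyncio.sleep(latency_s)  # stand-in for a remote verification call
    return name

async def main() -> None:
    t0 = time.monotonic()
    await asyncio.gather(*(run_check(n, s) for n, s in CHECKS.items()))
    elapsed = time.monotonic() - t0
    # Concurrent cost ~= max(latencies); sequential cost = sum(latencies).
    print(f"concurrent: ~{elapsed:.1f}s vs sequential: {sum(CHECKS.values()):.1f}s")

if __name__ == "__main__":
    asyncio.run(main())
```

Either way, the slowest dependency (here, the 1.2s citizenship check) sets the floor on response time, which is the poster's point.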

 

BlueStreak

(8,377 posts)
62. Yes, I suppose, but that's not where the main problems are
Wed Oct 23, 2013, 02:23 PM
Oct 2013

The problems that are most evident are simple web design things like handling exceptions gracefully.

And the ones I am seeing have nothing to do with legacy systems. I can't even get it to display the same set of policies consistently, and that surely should be a new database that is not embedded in any legacy system.

There is no excuse for any performance problems accessing the database of available policies because:

a) it is read-only, and

b) it can easily be clustered to any degree necessary to meet the demand.

And there is no excuse for failing to catch exceptions and inform the user appropriately when these errors do occur.

This is Web design 101 stuff.
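A minimal sketch of that 101-level handling: wrap the risky data call, log the real error, and render a clear message instead of leaving a hole in the page. The handler and the failure are simulated; the function names are hypothetical.

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("plans")

def load_policies(state: str) -> list:
    raise ConnectionError("policy database unreachable")  # simulated failure

def render_policies_section(state: str) -> str:
    try:
        policies = load_policies(state)
        return "\n".join(f"<li>{p}</li>" for p in policies)
    except Exception:
        log.exception("failed to load policies for %s", state)
        # Tell the user what happened and what to do next; never a blank region.
        return ("<p>We couldn't load plans for your state right now. "
                "Your application is saved; please try again in a few "
                "minutes.</p>")

if __name__ == "__main__":
    print(render_policies_section("OH"))
```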

bluestate10

(10,942 posts)
54. This is an unacceptable cluster-fuck and the Obama Administration owns it. I want to rip
Wed Oct 23, 2013, 01:26 PM
Oct 2013

my fucking throat out when I hear republicans talking about how angry they are that the website isn't working properly. Every single person in the Obama Administration, including the President, should have known that republicans would fucking attack if even the smallest problem happened. The Obama Administration stepped into a steaming pile of shit when it could have avoided the problem by starting with the base assumption that the roll-out of Obamacare COULD NOT HAVE PROBLEMS, or republicans would be having orgasms all over Washington DC. I don't want to hear more from the Obama Administration threading the fine line between the law and the website, I want to hear that the fucking problem is fixed and people are signing up without problems.

 

BlueStreak

(8,377 posts)
59. Well, no.
Wed Oct 23, 2013, 02:18 PM
Oct 2013

One of the biggest problems I see now is a lack of what computer scientists call "deterministic" results. That is just a fancy way of saying that for better or worse, you expect at a minimum to get the same results each time. The architecture selected to stitch together these various computer systems is faulty to the core. It obviously does not handle the most basic error conditions gracefully. When it is impossible to display some information (because of a time-out or other error) you cannot just leave a hole in your page. You must intercept the error and inform the user what is happening.

No, I'm sorry. This is not the fault of old systems. This is incompetent web design, plain and simple.

I'm not denying it can be a challenge to integrate systems from different technologies and different eras. But this happens every single day, quite successfully, in the world of IT. It is no different from saying that the Interstate highway designer must put a slow-down ramp in place when transitioning people from the highway to city streets. Every competent highway engineer knows that. Every competent IT practitioner (and I see no evidence that there were any competent practitioners involved in the healthcare.gov site) knows the things I am talking about.

Let's stop trying to rationalize this mess. It is arguably the biggest IT screw-up in 25 years. But nothing is gained by looking backwards. We need to fix it ASAP.

hollysmom

(5,946 posts)
64. A lot is gained by looking backwards
Wed Oct 23, 2013, 02:26 PM
Oct 2013

Calling good business practices back to order would be a gain, but I don't expect that to happen.

 

BlueStreak

(8,377 posts)
65. I agree, but we're really talking about the whole government procurement process
Wed Oct 23, 2013, 02:30 PM
Oct 2013

This is no different from the trillions in Pentagon bids that end up completely wasted. As far as contractors go, it is just a game. They are never invested in the results. Their only motivation is to meet the letter of the bid and spend as little money as possible doing so.

We aren't going to change that in the next few weeks, and that won't help us get healthcare.gov working better. But I do agree it is a huge problem, and the sort of thing you never seem to hear Republicans complain about. They loves them some big government contracts.

 

DeSwiss

(27,137 posts)
73. Government & Computer = Oil & Water
Wed Oct 23, 2013, 03:58 PM
Oct 2013
- Clearly the NSA and Defense Department in general approach computers "differently." For example:

The Department of Defense used 1,760 PlayStation 3s to build a supercomputer because it was the cheapest option. The United States Air Force Research Laboratory has completed what it calls the Condor Cluster, a supercomputer made entirely of PlayStation 3 consoles. The Condor Cluster, housed in Rome, New York, reportedly has capabilities heads and tails above every other interactive computer in the Department of Defense. Made of 1,760 PlayStation 3 processors and 168 general purpose processors, the Condor Cluster provides extreme power for its relatively low cost. It has the ability to calculate 500 trillion floating point operations per second (TFLOPS), but was made for $2 million thanks to the cost of the PS3. Mark Barnell, director of the Air Force Research Laboratory's High Power Computing division, says this is "a cost savings of between 10 and 20 times for the equivalent capability." It also uses 1/10 the power of a comparable supercomputer, apparently making it "green." link


K&R

riqster

(13,986 posts)
77. S'truth.
Wed Oct 23, 2013, 04:39 PM
Oct 2013

And many of the required interactions are mandated by law; at the same time, government has refused (for decades) to adequately fund the upkeep, upgrading, or replacement of those legacy systems.

jazzimov

(1,456 posts)
78. Seems some would agree with you
Wed Oct 23, 2013, 10:12 PM
Oct 2013
BL: I almost have the sense that HealthCare.gov is in de facto shutdown. Here's why: Government has to fix the back end before the front end. The demand here is real. I don't think anyone can dispute that millions of people want to sign up. So if they fix the front end for consumers, and thousands or hundreds of thousands of people are enrolled before they fix the back end, we'll have a catastrophic mess.

When insurers are getting 10 or 20 or 50 enrollments a day they can clean the errors up manually. But they can’t do that for thousands of enrollments a day. They have to automate at some point. So I think the Obama administration doesn’t want to cross the red line to shut the system down, but I think this is effectively a shutdown in which they don’t say they’ve shut it down but it basically is shut down.


(emphasis added)