From:keith.dziadek@enron.com
To:dan.dietrich@enron.com, tim.belden@enron.com, murray.o'neil@enron.com, arshak.sarkissian@enron.com
Subject:RE: IT problems Sunday
Cc:chip.cox@enron.com, diana.willigerod@enron.com, sammy.abu-khalaf@enron.com, bruce.smith@enron.com, david.poston@enron.com, steve.nat@enron.com, jeff.richter@enron.com, john.forney@enron.com, mark.guzman@enron.com, monika.causholli@enron.com
Bcc:chip.cox@enron.com, diana.willigerod@enron.com, sammy.abu-khalaf@enron.com, bruce.smith@enron.com, david.poston@enron.com, steve.nat@enron.com, jeff.richter@enron.com, john.forney@enron.com, mark.guzman@enron.com, monika.causholli@enron.com
Date:Mon, 27 Nov 2000 08:21:00 -0800 (PST)

On Sunday, November 26, at approximately 10:30 CST, EIGRP (routing protocol)
neighbor connections between the Portland router and the Houston router began
to time out. The frequency of this occurrence is described in the previous
email from Phillip Platter. The problem cleared up at 16:56 CST. While I was
debugging the problem, all test results pointed to a problem on the EIN.
However, when the EBS NOC was contacted, they informed us that there had been
no known network issues or scheduled changes. Because of what I experienced
while debugging, I had an EBS network engineer come to my office this morning
to walk through the historical log data of the physical devices within the
Houston to Portland path. The problem was finally pinpointed to an EIN DS3
link. A GSR router to which this link was attached was losing OSPF neighbor
connections with its peer on the other side of the link due to a bouncing
interface, thus causing the EIGRP neighbor connectivity errors.
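
(For reference, here is a rough sketch, in Python, of the kind of check the
EBS engineer and I performed by hand against the historical device logs:
counting how often the DS3 interface bounced per hour. The timestamped
events below are made-up placeholders, not the actual EIN log data.)

    from collections import Counter
    from datetime import datetime

    # Made-up sample of timestamped link state changes, in the shape they
    # might be transcribed from a router log; placeholders only.
    events = [
        ("2000-11-26 10:31", "down"), ("2000-11-26 10:33", "up"),
        ("2000-11-26 11:05", "down"), ("2000-11-26 11:07", "up"),
        ("2000-11-26 16:40", "down"), ("2000-11-26 16:56", "up"),
    ]

    # Each transition to "down" counts as one flap; grouping flaps by hour
    # shows whether the interface was bouncing during the outage window.
    flaps_per_hour = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime("%H:00")
        for ts, state in events
        if state == "down"
    )

    for hour, count in sorted(flaps_per_hour.items()):
        print(f"{hour}  {count} flap(s)")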

Because this is a critical link on which 24x7 trading is conducted, the
following action items will be addressed:
- We will implement our own monitoring device (currently only able to monitor
up/down/response time) on the edge of the EIN in an attempt to monitor the
path being utilized more proactively (a rough sketch of such a monitor
follows this list).
- I will verify with EBS that all appropriate devices are being monitored by
the NOC and that a notification process is implemented (it should include
both problem and change notification).
- I will push the issue of gaining access to the EIN edge router. This will
allow us to troubleshoot, manage, and monitor EIN link status and errors more
efficiently. Knowing the error type and frequency will allow for quicker
determination of severity and, therefore, faster implementation of secondary
solutions.
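
As a starting point for the first action item above, a minimal sketch (in
Python) of the interim monitoring device: it only checks up/down state and
approximate response time from the edge of the EIN, which matches the
limitation noted above. The target address, polling interval, log file name,
and Linux-style ping flags are placeholder assumptions, not the actual EIN
configuration.

    import subprocess
    import time
    from datetime import datetime

    TARGET = "10.0.0.1"      # placeholder, not the real EIN edge router
    INTERVAL_SECONDS = 60    # poll once a minute
    LOG_FILE = "ein_path_monitor.log"

    def probe(host):
        """Send one ping and report UP/DOWN plus approximate response time."""
        start = time.time()
        result = subprocess.run(["ping", "-c", "1", "-W", "2", host],
                                capture_output=True, text=True)
        elapsed_ms = (time.time() - start) * 1000
        state = "UP" if result.returncode == 0 else "DOWN"
        return f"{datetime.now().isoformat()}  {host}  {state}  {elapsed_ms:.0f} ms"

    if __name__ == "__main__":
        while True:
            with open(LOG_FILE, "a") as log:
                log.write(probe(TARGET) + "\n")
            time.sleep(INTERVAL_SECONDS)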



Arshak,
Because of the following concerns, it appears to me that the NOC is not
appropriately monitoring EIN devices and is, in turn, negatively impacting
Enron business. I am copying you on this note because I need your assistance
in identifying a contact at the NOC to whom I can communicate these concerns
and with whom I can work to implement processes that help prevent future
issues.

Concerns:
- Time required to fix this problem (6.5 hours - I think it finally cleared
up on its own)
- The lack of notification of, or knowledge about, the problem
- The erroneous information communicated to the customer (Enron Net Works).

Keith



-----Original Message-----
From: Dietrich, Dan
Sent: Monday, November 27, 2000 9:35 AM
To: Belden, Tim; O'Neil, Murray
Cc: Cox, Chip; Willigerod, Diana; Abu-Khalaf, Sammy; Bruce
Smith/HOU/ECT@ENRON; Poston, David; Nat, Steve; Dziadek, Keith; Richter,
Jeff; Forney, John; Guzman, Mark; Causholli, Monika
Subject: IT problems Sunday

Tim and Murray,

Yesterday the real-time desk experienced dropped Terminal Server (TS)
sessions, which impacted their productivity. Diana and Chip contacted the
network team in Houston to assist with troubleshooting the problem. Keith
Dziadek monitored the DS3 VPN connection between Portland and Houston and
determined that the dynamic fail-over within the Enron Intelligent Network
(EIN) is causing the dropped TS sessions. This dynamic fail-over is a good
thing. Our issue is that the TS sessions are sensitive enough that the
duration of a dynamic fail-over on the EIN causes them to drop, while all
other applications (e.g. Enpower, running locally in Portland) are not
affected.

Keith is meeting with his team in Houston, as well as staff from EBS, today
to form an action plan. Once the action plan has been established, Keith or I
will email you what will be done and when.

You can reach me on my cell if you have any questions.

Regards,

Dan Dietrich
Cell: 403-818-0815


---------------------- Forwarded by Diana Willigerod/PDX/ECT on 11/26/2000
03:33 PM ---------------------------

From: Phillip Platter 11/26/2000 03:19 PM
To: Jeff Richter/HOU/ECT@ECT, Tim Belden/HOU/ECT@ECT, John M
Forney/HOU/ECT@ECT, Murray P O'Neil/HOU/ECT@ECT
cc: Mark Guzman/PDX/ECT@ECT, Monika Causholli/PDX/ECT@ECT, Diana
Willigerod/PDX/ECT@ECT, Chip Cox/PDX/ECT@ECT, Paul Kane/PDX/ECT@ECT

Subject: IT problems Sunday

Please be advised of a serious problem with Terminal Server!

We were all kicked out of Terminal Server repeatedly on Sunday. At first
it happened about every half hour (9 a.m. to 10:30 a.m.). Then it happened
every 5 minutes. Communication seemed slow. We paged and called the Portland
IT Team at 11:30 (Diana & Chip). At 12 noon Diana said Chip and Paul Kane
were working on it. At around 12:30 Paul Kane called back and said "Vince"
in Houston was working on it. We asked for Vince's phone number and Paul
stated that Vince would call us instead. Vince called at about 1:00 and said
everything was now working. Five minutes later we were still getting kicked
out. We then paged and called Chip, and he said he was trying to get ahold of
Houston. At 1:30 we called the Resolution Center. They indicated they had no
clue we were having problems. At 1:45 we called Greg Marsalis in Houston and
he said he was unaware of who was working on it, but was going to look into
it.

Greg did look into it and said they were still not able to pinpoint the
problem. At this writing (3:10 p.m.), we are still not able to use Terminal
Server and consequently are unable to transact with the ISO or to enter
deals in Enpower. This has also affected our connection to the PX Trade App.
We are effectively prevented from taking advantage of market conditions in
California. Earlier I had started to initiate some congestion relief on the
Palo Verde tie. Congestion was $225 to $250 at Palo and I was confident I
could take advantage by moving power out of SP. I was unable to do anything
in Caps as I was kicked out every few minutes.

I feel the communication process regarding this problem has been flawed. I'm
sure there is plenty of blame to go around. I suggest we meet with IT to
define a process whereby we will get immediate attention and results.

As of 3:20 we still have the same problems.

I am hoping this problem is isolated to Sunday.