Showing posts with label Control Systems. Show all posts

20 December, 2007

Wicked Cool but can you Hack it?

200KW self regulating Mini Reactor with 40 year lifespan

Update:

I want to know if it is a true fission reactor or just a decay based one.

More here; I don't think it is the same thing though, as this one looks more like an RTG.

and Here

And here, the Homeland Security angle

This one talks about the 4S mentioned above

06 June, 2007

Neural Networks in Process Control Environments

This is pretty interesting at Emerson

I found it via the Emerson Process Control Blog

I would think you would have to be very careful about how feedback paths (internal to the controllers or within the process itself) affect variability in the control functions (especially unforeseen cascades), but it does look like it has quite a few interesting applications.

02 April, 2007

SCADA Vuln

Hole in SCADA talked about at EWeek

Contrary to what they say, this is not the "first hole found" in SCADA software, though it does seem to be widely disseminated at this point. It probably falls into the category of first fully disclosed.

It is similar to a lot of the OPC crap floating around in the rest of the IT world.

To be honest with a lot of the current SCADA Ethernet equipment you don't need a hole. The front door is open.

Definitely a matter of concern if for no other reason than the spotlight is a bit brighter now. Decent article.

Update:

Dale has more on it.

20 March, 2007

PC Based Control - Huh?!?!

When the hell was this first written?

NT 4.0? "Deterministic, hard real-time operating system"? Huh? "The PLC is fundamentally a box or computer with a processor. "???

The article is dated March 2007 but if this is recent and a legit take then it shows exactly what we have to worry about, albeit unintentionally.

Don't get me wrong, there is a place in many industries for properly developed "PC based" systems (whether Windows, Linux, or another OS) to directly control processes, but I have to wonder if the author of this ever developed and implemented a truly complex integrated control environment.

Woefully uninformed and simplistic.

I have to assume this was written years ago and just dug up or perhaps relabeled. If so it shows how we got to where we are from a security perspective in the SCADA and DCS world. If not it shows very well how far we have to go.

27 February, 2007

Transport Layer Security - Part 1

Part of the Security Layer Series

Layer 4 is where the rubber meets the road as far as actual connectivity to the applications and logic of the controllers.

Layer 4 is the transport layer, and for IP it typically means either TCP (Transmission Control Protocol) or UDP (User Datagram Protocol).

I mentioned earlier that IP is inherently non-deterministic and that this has implications for automated control. Layer 4 is the first place where the compensations for this occur.

A quick run-through of how TCP works will help some. I am going to grossly oversimplify here, so if someone wants to correct me or provide more detail, feel free.

TCP establishes a session to ensure data delivery. A host initiates the communication by sending a TCP SYN packet. The recipient of the SYN responds with a SYN/ACK containing session identification information, and the original host replies with an ACK, establishing the session. Periodically during the communication stream the acknowledgment process is repeated to ensure the communication is maintained. Checksums are included as an inherent part of the protocol, and the time between received packets is monitored to determine whether a session is lost and to initiate reestablishment of the communication stream.
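If you want to see this without firing up a packet capture, here is a minimal sketch (Python, loopback only; the echoed "ping" payload is just an arbitrary example) showing that the whole SYN, SYN/ACK, ACK dance is hidden inside an ordinary connect call:

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and echo one message back."""
    conn, _ = server_sock.accept()   # the handshake completes here on the server side
    conn.sendall(conn.recv(1024))    # checksums and retransmission are the stack's job
    conn.close()

# A throwaway listener on an ephemeral loopback port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=echo_once, args=(server,), daemon=True).start()

# create_connection() is where the SYN -> SYN/ACK -> ACK exchange happens
client = socket.create_connection(("127.0.0.1", port), timeout=2)
client.sendall(b"ping")
reply = client.recv(1024)            # the data rides the established session
client.close()
server.close()
```

Run tcpdump or Wireshark against the loopback interface while this executes and you will see the three-way handshake, the checksummed segments, and the teardown, all courtesy of the OS stack rather than the application.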

What this means, in a nutshell, is that TCP has many mechanisms built into it that compensate (in part) for the issues introduced by the fact that IP is non-deterministic. It doesn't by any stretch of the imagination mean that TCP itself is secure in any way. There are many ways to game the system, and hackers and worms use them to their full advantage. If you really want to get into the details, take a look at NMAP and the lists at www.Insecure.org.

The most common one, and the one I have seen cause issues on PLC's, is the SYN scan. It basically works by opening a listening port and then streaming SYN packets to all of the selected ports on every address to be inspected. Everything that responds with a SYN/ACK is logged, but the connection is never completed with the final ACK. This is where the problem is (especially for controllers with older IP stacks): the receiving host uses some resources to sit there waiting for that ACK. There are DoS attacks related to this, but for the most part they are not that effective against newer IP stacks. (SYN floods can still cause headaches, though.) Unfortunately PLC's do not always have newer stacks, so they are often particularly vulnerable to this.

Aside:

This is directly relevant to the scanning discussions that have occurred, with some level of passion, in this blog's comments and in the background via email. My advice here: if you plan on scanning a SCADA system for the first time and you have done the change management, it is best to start with a TCP connect scan that exits gracefully as your initial connection enumeration method. Limit the scan to a few interesting ports and don't hit all 65k (at first, at least); I wouldn't even do the fast-scan ports. After you have a few under your belt for that address range, slowly expand: do the fast-scan ports, then, if wanted, the whole 65k. Once you are comfortable with this, make sure you have people watching the equipment and have a recovery plan, then try the SYN scans. After you have gotten past this point you can go on to the rest of your vulnerability assessment or pen test. I know this is insanely conservative for most security professionals, but the critics are not exaggerating when they warn that bad things can (and will) happen. I am an advocate for scanning systems and have done so many times without significant issue on Rockwell/ABB, Honeywell, Siemens, and other vendor control systems, but there is always a risk. My typical response to the DON'T SCAN crowd is: "Sooner or later the systems are going to be hit by an actual attack, or something that is functionally identical to one, so wouldn't you rather that happen in a controlled manner?"
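To make the conservative starting point concrete, here is a rough Python sketch of the kind of TCP connect scan I am describing. The host address and the handful of control-system ports are illustrative only; substitute whatever your change management has actually approved:

```python
import socket

def connect_scan(host, ports, timeout=1.0):
    """TCP connect scan: full handshake, then a graceful close.

    Unlike a SYN (half-open) scan, every probe that gets a SYN/ACK is
    completed and shut down cleanly, so the target is never left holding
    a half-open session waiting for the final ACK.
    """
    open_ports = []
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(timeout)
        try:
            s.connect((host, port))      # full three-way handshake
            open_ports.append(port)
        except (socket.timeout, OSError):
            pass                          # closed, filtered, or no response
        finally:
            s.close()                     # graceful teardown, not an abandoned SYN
    return open_ports

# Start with a few interesting ports, not all 65k (examples: Modbus/TCP,
# ISO-TSAP/S7, EtherNet/IP; confirm the real list with your vendor)
interesting = [502, 102, 44818]
# results = connect_scan("10.0.0.5", interesting)
</cr>```

The design choice is the point: a connect scan uses only the OS's normal socket path, which is the gentlest thing you can throw at an old controller stack, and it throttles itself one port at a time.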

End of Aside

Many PLC vendors use TCP as the primary IP communication method to their controllers, and all of them use it for their historians, MES, and control aggregation systems. I have seen a bit of an explosion in HTTP access to endpoints, and I have mentioned ModBusIP in earlier posts in this series. I am not going to go into detail on which ports are used here; if you want to find out, ask your vendor and they will tell you. What you should do, however, is make sure that, where possible, you block access to the TCP port used as the primary PLC communication protocol as close to the controllers as possible. ACL's are acceptable if actual firewalls are not available. For vendors that use standard ports such as telnet, HTTP, or RPC this can be somewhat more difficult to do. Take advantage of point-to-point and point-to-multipoint (subnet) rules. The key here is to not allow access to the PLC's from an uncontrolled network. Access to the historians and central control systems should be controlled primarily on a whitelist basis. For really large engagements, such as regional operation centers, it is often possible to isolate both the central and the local subnets and connect them via VPN tunnels. If you are doing this, it is best to isolate remote sites from each other.

Enough for today

Rest of TCP and UDP continued later.

25 February, 2007

Prosoft Security

I hammered Prosoft for the typical "We have security" marketing approach but they really anted up with this post. It doesn't provide details for their solutions (which I would still like to see) but it does show that they understand this problem.

08 February, 2007

Layered Security Control Series Aggregation Post

This is the overview summary of a series of posts mapping Information Security Controls to SCADA, DCS, and ACS environments. The primary approach of the control structure is to map the controls to a modified OSI model. This is imperfect but does provide a technical framework to serve as the seed of the structure. The last half of the layers (pretty much everything beyond the host layers) departs from this model.

While these posts have specific data relating to SCADA and other control system environments, much of the information is applicable to any information security environment. Many of the concepts and much of the data in the posts are relatively basic and most useful for people who are just entering the information security and SCADA security fields, but there should be enough good nuggets of data that even experienced professionals will find some value in reading them.

My intention is to convert each of the sections into extended PDFs and pamphlets that have additional data and details over the initial posts. I am not certain when this will be done.

Building controls in multiple layers provides very strong security even with imperfect individual controls.
From an earlier post on layered controls

So if you can’t get 100% with a single control how do you get 100% or close to it?

I’ll use worms as the example because it is easy not because I think they are the most likely current threat.

If you can stop 80% of the worms with your company's external firewall.

Then stop 80% of the remaining worms with segmentation to your PCN.

Then stop 80% with a NIPS device

Then stop 80% of the remaining with a Host based firewall

Then 80% with patching

Then 80% with HIPS

Then 80% with Memory Based Protection

Etc…

If you can get an 80% reduction with each layer, then the seven controls above leave a residual likelihood of 0.2^7, or roughly 0.001%, even if you had 100% certainty of the threat event occurring to begin with.
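The arithmetic is simple enough to sketch; the 80%-per-layer figure is, of course, just the illustrative assumption from above, and the whole model assumes the layers fail independently of one another:

```python
def residual_likelihood(reductions):
    """Residual threat likelihood after stacking independent controls.

    Each entry is the fraction of remaining events that a layer stops.
    """
    remaining = 1.0
    for r in reductions:
        remaining *= (1.0 - r)   # each layer only sees what got past the last one
    return remaining

# Seven layers from the list above, each stopping 80% of what reaches it
layers = [0.80] * 7
print(f"{residual_likelihood(layers):.7f}")   # 0.2**7 = 0.0000128, i.e. about 0.001%
```

The independence assumption is the weak spot: two controls that share a failure mode (say, the same signature feed) multiply together far less cleanly than this.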

So the trick is identifying the applicable controls, determining how (and how much) they reduce the likelihood, and whether they can be layered with outer controls.


By not relying on an individual control being perfect you reduce cost (because you have a greater choice of solutions), you reduce impact on the overall system design, and you increase flexibility for your designers and end users.

The posts of the series, in order, are:

Physical Security Layer

Data Link Layer Security Part 1

Data Link Layer Security Part 2

Networking Layer Security Part 1

Networking Layer Security Part 2

Transport Layer Security Part 1


Host Security Control Layers (being planned)

Process Controls including standards and procedural structures (TBD)

Governance Controls including visibility and audit feedback mechanism (TBD)

Financial incentives (Budgeting and leveraging business unit decisions using money and risk) (TBD)

Memetic Controls (Training, Expectation setting and Marketing) (TBD)

By properly combining the controls in these layers it is possible to get a working flexible and highly secure Operating environment that is able to adjust to problems quickly with the least amount of cost.

Safety Valve design options

Some Good safety valve designs and tips at the Emerson blog.

It is a follow-on from this post on partial stroke testing as a supplement to standard full stroke testing.

It is a nice reminder for me that even though IP-connected systems are increasing in frequency, simple designs are best, especially when it comes to safety.

This brings me back to something I should have included in the original Physical Security Layer post and somewhat touched on in my continuation of the Network Layer Security. It is essential that your safety systems cannot be adversely impacted by the operations or failures of any of your other systems. I mentioned in the network layer post that they should be separate from the other networks, but the real advice is that they should be as simple as possible; physically and logically isolated from all other systems (in terms of connectivity; placement obviously depends on need and overall system design); and, most importantly, protected from failure modes that the other control systems might be subjected to.

This is a great blog.

Does anyone out there know if Invensys, Honeywell, Rockwell/ABB, or Siemens have a blog like this? I haven't been able to find one but if they do I would really like to add it to my RSS stack.

31 January, 2007

SCADAGard SIG

N-Circle is talking about the new Infragard SIG.

This stuff is good. He mentions the lack of awareness of security issues in the SCADA world and has a point but it is also nice to see the information security world start to take notice. It will be interesting to see the preconceptions of both sides challenged.

Between Symantec, Determina, Tenable and N-Circle the word is getting out that there is a significant market here.

That market is huge (at least the size of the existing IT market, perhaps larger in terms of capital availability) and hungry for solutions that fit. Right now it is almost entirely a security vacuum. It has some real and significant distinctions from the general IT market, but a lot of the existing solutions can be adapted to fit if done properly.

I am looking forward to the merger over the next several years.

29 January, 2007

Layer 3 – Networking - Continued

Continued From

ACL’s, Firewalls and the bottom capabilities of NIPS

If you have successfully divided your PCN subnet from the rest of your LAN's, you still have to have a way to enforce that separation. Access control lists (ACL's), firewalls, and the bottom layer and capabilities of a NIPS provide methods of doing this. Note that I am not getting into ports yet; that's the next layer up.

At layer three they all function in a relatively similar manner and are close to being the same capability. Firewalls (and NIPS using firewalls) of any type are less likely to be susceptible to spoofing or man in the middle attacks from traffic that must traverse the PCN to the Business network but most routers and switches in the last few years have a pretty robust ACL capability. A firewall capable switch or router gives even more flexibility but isn’t always available. The real key here is how the networks are set up.

For smaller organizations a single division point and one network is all that is necessary.

In this environment you would have a PCN connected via a firewall to the business network. If the business network has access to the internet (which they all do), it is essential that that access is also protected by a firewall. This isn't about protecting your business network, so I will skip all of the details here, but it is important to remember that if you have connections to your PCN, then anything that compromises your business network also puts your PCN at increased risk. This means that a solid DMZ and extranet environment is important for the business network. I am writing all of the rest of this from the presumption that this is the case.

I have never seen an acceptable reason for a PLC to be directly accessible from the business networks, so putting in a "log any any, drop any any" (dump your logs to a syslog server) for PLC addresses should be the standard. If there is a need to directly access a PLC from a remote point (and there sometimes is), then use a VPN or some other secure authentication and communication method to facilitate the access. Terminate it on a separate subnet that has no direct external access and then route from there.
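As a toy illustration of that rule ordering (the addresses here are made-up RFC 1918 examples, not a recommendation), the first-match logic a firewall or ACL applies looks roughly like this:

```python
import ipaddress

PLC_SUBNET = ipaddress.ip_network("192.168.10.0/24")   # hypothetical PCN range

RULES = [
    # (source network, destination network, action) - evaluated top down
    (ipaddress.ip_network("192.168.20.5/32"), PLC_SUBNET, "permit"),  # one engineering station
    (ipaddress.ip_network("0.0.0.0/0"), PLC_SUBNET, "deny"),          # log any any, drop any any
]

def evaluate(src, dst):
    """First matching rule wins; deny hits are where you hang the syslog."""
    src, dst = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for src_net, dst_net, action in RULES:
        if src in src_net and dst in dst_net:
            return action
    return "deny"   # implicit deny if nothing matched at all

print(evaluate("192.168.20.5", "192.168.10.7"))   # permit
print(evaluate("10.1.1.1", "192.168.10.7"))       # deny
```

The shape matters more than the syntax: specific permits up top, an explicit logged catch-all drop for the PLC range underneath, and an implicit deny as the backstop.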

For larger companies and organizations there will be a need to provide multiple differentiated networks. Many organizations use a PCN DMZ (sometimes called a Process Information Network [PIN]) to house Historians and MES. By doing this you can granularly control access to actual control nodes while greatly simplifying secure access to data from the production nodes.

I have seen a lot of other distinctions

Utility Networks – used to house servers that pass patches, AV updates, software revisions and other utility software (be careful that it doesn’t just become the easy way around security)

ESD Network – Emergency Shutdown Network – Just like the name implies, these house the systems used to shut down in an emergency. Access is very tightly controlled; often these systems are completely separated from the others.

Critical Systems or Red Line Networks – For highly critical valves, pumps, breakers and gauges a critical systems network allows tight granular access and control of access for systems that may have safety or environmental significance or for systems that might have cascading failure modes.

Monitoring Network – A network where PLC’s or RTU’s are used only for monitoring functions and have no direct control capabilities. Because the risk of inadvertent operation is much lower a looser set of controls can be applied. You still must be careful that it isn’t used as a jumping point to other systems. You also have to be careful if it is used in an open loop control scenario where an operator is making control decisions based on the readings.

Legacy Network – used to separate legacy and unmanaged equipment from the rest. This is a very important network to consider. The fact of the matter is that for many automated control systems there will be hold over systems that have distinct security issues that might be better off separated from other systems.

Vendor Systems Separations – Many vendors who have taken up the security hue and cry have started defining their systems within specific subnetting requirements. In general this is a good thing, because they can tightly control access and what traffic goes in and out based on their own hardware's needs.

Vendor PCN Extranet – An extranet subnet that houses servers to provide synchronization and control between divergent vendors OR (big OR not and) provide a controlled access drop off point for vendor access to systems for maintenance. I have seen both definitions used for the same term. If someone wants to come up with something better please do. I’ll float it and see if it catches on.

Partner PCN Extranet – Allows a controlled termination point for access either between operating partner networks or for external contractor controls either for troubleshooting or for actual operations.

Site PCN Extranet – Allows for the aggregation of information and data controls from multiple sites. It is distinguished from the PIN extranet in that actual control functions might be necessary such as on pipelines or long distance power transmission lines.

Site PIN Extranet – usually aids in the termination into a centralized control and operations center. Also provide a gathering point for production data into business systems in very large companies.

Whew...

There are actually a few more but I am stopping now. The key here is keep it as simple as possible. If adding one of the network subdivisions I mentioned above helps make control of access to those systems simpler and doesn’t make the overall design too complicated then use it. If, on the other hand, you only have a few dozen PLC’s and a single historian then the simplest solution is best. One firewall and at most two control networks, a PIN and a PCN should be fine.

Same catch phrases as always for firewall or ACL configuration: least rights needed for effective operation. The default at the end of the chain is deny any any, and above that are specific permits for the traffic that is absolutely needed. If they don't demonstrate a defined need to get to an address, don't permit it.

If you are on a more complicated network, then the business network should access the PIN and vice versa, and the PCN should access the PIN and vice versa, but it should be designed such that the PCN never needs to access the business network or vice versa.
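That flow policy is small enough to write down explicitly. A sketch, with zone names of my own choosing:

```python
# The PIN sits between the other two; business and PCN never talk directly.
ALLOWED_FLOWS = {
    ("business", "pin"), ("pin", "business"),   # business <-> PIN
    ("pcn", "pin"), ("pin", "pcn"),             # PCN <-> PIN
}

def flow_allowed(src_zone, dst_zone):
    """True only for explicitly permitted zone pairs; everything else is denied."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS

print(flow_allowed("business", "pin"))   # True
print(flow_allowed("business", "pcn"))   # False: the PIN is the only meeting point
```

Writing the policy as an explicit whitelist of pairs, rather than a blacklist, is the same default-deny habit as the firewall chain above.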

ESD and Redline Networks should be locked tight except during controlled change windows.

24 January, 2007

Layer 3 – Networking - Security

There is really only one item on the networking layer that is significant from an ACS perspective, but that item is a huge one.

IP is on controllers and control networks.

Of course IP is everywhere. Why wouldn’t it be?

It is so beautifully simple. Some of the best and most elegant engineering I have ever seen.

With 4 bytes of information (about the amount needed to encode a few letters of your name) you can get from any computer in the world to any computer in the world and back again.

Oh, this is a bit over simplistic. There is certainly more information involved in the total train of the data movement but as far as your computer is concerned only 4 bytes matter. How simple can you get? The fractal complexity that grows from this seed is amazing.

Enough cheesiness.

The consequences of this are what make all of the other security concerns significant. If a PLC or MES is connected to an IP network (even indirectly) then anyone in the world that knows how can access them (though not necessarily easily). With controllers and MES’s the way they are currently designed that means that potentially anyone in the world can operate them. That means that anyone in the world can potentially operate the equipment they are connected to.

Everything else flows from this.

So what are the control mechanisms for layer 3?

VLAN’s
Subnetting and Subnet design
Routing
ACL’s
Firewalls
NIDS
NIPS


VLAN’s

For the most part a VLAN's purpose at layer 2 is to logically divide, and possibly isolate, separate information conduits. The significance at layer 3 is that it is very easy to route around a VLAN as a divider. This can be done in several ways; the most common is simply using a router, but dual-homed and multi-homed systems are also a threat. Basically, this means that the control aspects gained using VLAN's at layer 2 are useless if there is open routing of any type between the VLAN's. Many times I have been told, "Oh, don't worry, it is on its own VLAN." The engineer thinks that somehow that provides isolation. It doesn't. The point is that a protection that can be quite effective when viewed exclusively from the perspective of its own layer can be easily rendered useless at a higher or lower layer if it is not coupled with additional controls.

Subnetting and Subnet Design

By themselves subnets provide very little control. Done properly they can provide slight advantages to other controls. More importantly, if done improperly, they can actually make it impossible to secure a system by drastically reducing the options of control available.

PCN’s should be on their own subnet. There is no technical reason for a PCN to co-reside on a subnet used for other purposes. They often do because it is difficult to get a new network set up specifically for use as a PCN and there is a cost associated with separating them but in my opinion the small additional cost and amount of work is trivial compared to the amount that not separating them increases the threat environment. This is true even for non-significant PCN’s.

This one might be a bit contentious but I am a fan of using private address spaces for PCN’s. It provides some control in that it limits the potential external accessibility (ok not much but even a little can help), it helps people keep the networks separate in their minds, it doesn’t significantly impact connectivity and it allows some obfuscation of the environment at least from certain perspectives. The only real drawback is that to access it remotely NAT might be necessary (of course I kinda see this as a plus).

Keep the subnets relatively small while allowing for growth. There is absolutely no reason I can think of for having a 248 or 240 mask. If the PCN is going to be that large, it wouldn't hurt to logically divide it anyway. Increased division can also help from a redundancy and reliability standpoint by facilitating the use of routing protocols for redundant paths vs. spanning tree. Use spanning tree only for close redundancies of one or two hops at most (in my opinion not even then; I am really not a fan of spanning tree. I see it as an attempt to inject layer 3 functions into an inherently layer 2 protocol suite, and its only valid function is stopping loops, not providing redundancy. Sorry, networking religious quirk of mine.) Use routing for anything more significant.

If you have a large enough site to require multiple subnets and you are using private addresses (or are lucky enough to have a huge public range and choose to ignore my advice to use private ranges anyway), choose subnet breakdowns that allow for easy masking for expansions or acquisitions (net ranges at 16, 32, or even 64 on a 10. network). This is good advice for normal networking as well. I don't know how many organizations I have seen paint themselves into a box with 10.1, 10.2, 10.3 schemes that prevented easy logical aggregation using the octets themselves without sucking up huge ranges.
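Python's ipaddress module makes it easy to check whether a candidate scheme aggregates cleanly. The 10.16.x.x ranges below are hypothetical examples of the kind of breakdown I mean:

```python
import ipaddress

# Hypothetical plan: per-unit /24 PCN subnets carved out of a site block
# that starts on a clean power-of-two boundary (10.16.0.0)
site_pcns = [ipaddress.ip_network(f"10.16.{i}.0/24") for i in range(4)]
print(all(net.is_private for net in site_pcns))   # True: RFC 1918, per the advice above

# collapse_addresses merges adjacent ranges where the boundaries allow it
summary = list(ipaddress.collapse_addresses(site_pcns))
print(summary)   # [IPv4Network('10.16.0.0/22')] -- one mask covers the whole site

# Contrast: a 10.1, 10.2, 10.3 scheme cannot be covered by a single prefix
# without sucking up ranges you never allocated
awkward = [ipaddress.ip_network(f"10.{i}.0.0/16") for i in (1, 2, 3)]
print(list(ipaddress.collapse_addresses(awkward)))   # still needs two prefixes
```

If the summary comes back as one prefix, the whole site can later be matched by a single ACL entry or route; if it comes back fragmented, the scheme has painted you into exactly the box described above.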

Routing

With one exception (the Gulf of Mexico's deepwater rigs), almost all PCN's I have seen have been small enough that they are end subnets on any routed network. My only real comments on this one: why route it if you don't need to, and if you do route, contain the gateways and paths to something you (or at least your organization) have control of.

MPLS hasn’t caused any significant problems that I have seen yet but it can be compromised from the provider side. This compromise is not limited to watching traffic. A friend of mine and I successfully did an injection attack by replacing labels in line using a perl script. We convinced “customer” network Alice that we were an address on “customer” network Bob and pinged addresses in Alice. This was in a lab environment so this is easier said than done but it is possible. The main reason I think this is significant is that in some nations access to the nodes of the provider network might not be as controlled as in others. Of course the same risk holds true for Frame Relay and ATM but the pool of potential hostiles that are knowledgeable enough to pull it off for those two is a lot smaller. I also trust the carrier networks less because I know that many of the MPLS networks are growths from the older and uncontrolled MIP days. Frame Relay and ATM networks were never used as direct IP ISP’s. (though they did carry them at a different layer) Plus MPLS is growing like a weed because it saves the carriers money and they can pass a bit on to the customers.

Anyway you’ve been warned.

Enough writing for now. I’ll do ACL’s Firewalls and NIPS/NIDS Thursday or Friday.

SCADA Crypto

Crypto in Controllers at Digital Bond

This was also a topic of discussion at my Monday night dinner. One of the concerns for me is that as complexity is added the likelihood of unintentional failure increases.

It becomes a balance between the risk due to adding complexity and the risk of impact from either nefarious or mistaken connections.

I tend to think that we need to pursue these types of solutions now for the systems that need very tight controls and for a future environment that might be significantly more hostile. We should, however, be careful of how we deploy them.

If you look at my Ideal PCN post from a few months ago I touch on this.

Another quick comment: The Crypto isn't what matters here it is the control over access that the crypto provides that could add value.

05 December, 2006

Fact #4 - Bad Guys Know

The Bad Guys Know
List of Facts

The bad guys are now realizing that there is something here.

When I first wrote this fact more than 2 years ago it was new. Now I don't know how anyone could deny it. They have found SCADA plans with terrorists. In case you think this is a new phenomenon that is only occurring because of the current hype, it was talked about back in 2002, with reconnaissance to back it up then and even before.

For some reason I keep running into people who say we shouldn't talk so loud about it.

Well two replies to that.

Worms and malware don't care what I say.
The real bad guys have known for years.

If you are involved in engineering control systems and you are not already developing a layered approach to security, you will have a problem sooner or later. You might put it off by delaying scans that would show how well you stand up, or by stating that "we don't connect our SCADA systems to the IT network," but if you have IP-connected systems (and more and more organizations do), sooner or later you'll deal with it.

It is best to deal with it in a controlled environment.

29 November, 2006

Fact # 3 - Standardization - IT and DCS Merging

Standardization with existing IT vendors is happening to SCADA systems and is subjecting new areas (control systems) to old threats (hackers and worms). This results in the creation of significant risks to safety, the environment, and business.

Part 1 Here

More and more control systems, historians, and actual control devices are adopting standard, readily available operating systems, communication protocols, and connection mechanisms. This subjects control systems to the same threats that plague other IT systems. It also gives them some significant advantages in both dealing with the threats and providing service. The rate at which this is occurring is also accelerating.

One of the largest items that concerns me about this fact is the reduced cycle times of deployments. Older control systems could literally go for decades and work fine. In Jake's comment to my IT vs Control Engineer challenge he points out the breakneck speed at which IT systems have to respond to new threats. In pure info security circles we have moved from hacks, to worms, to zero days, and now to less-than-zero-day threats. (This was an interesting thread to watch develop; it starts here.)

As more moves (and it is going to move whether we want it to or not), this is just going to get more pronounced.

I recently participated in an email go around about the lack of support for NT4 with a few industry heavyweights and how we communicate the risk this entails to the ACS community as a whole. One of them liked this article from 2001 about NT4 SP7.

Hell, in the IT world companies are born, grow, go through a midlife crisis, and either go out of business or are gobbled up and disassembled in a quarter of the time that most engineers expect their overall plant control system to last.

How many out there still have VAX, DEC and Compaq? How about IP21 systems? The list goes on.

They work fine; the problem is that if you want to buy a new system you have to get Microsoft, Linux, or AIX as the OS. (Yes, I know the PLC's are different; I am mostly talking about the historians and control stations here. It still matters.)

There really is very little choice. This means you have to be thinking about what you are going to do when Vista has been out for 4 years and MS (rightly) refuses to support 2003 let alone 2000.

There are a lot of implications to compressing the cycle time from 50 years to 20, to 10 and then to 5 or less. I think this is the biggest fact we have to prepare for but certainly not the only.

20 November, 2006

Scanning Vs Not Scanning - This deserves to move out of comments

Tell me if I am wrong and why. Give me another option.

CNI operator said...
Jim, you already know my views on this! My view on scanning was re-enforced when I completely wiped a vendor's PLC-level device during a test in their lab. Before scanning, I'd need to be absolutely sure of what's on the network and be sure the devices can stand up to the scan.
20/11/06 4:03 PM

Jim C said...
I agree. I am not saying to go willy nilly and pull down Nessus and start a scan.

What I am saying is that after you make sure your systems can handle specific settings and after you have informed all of the right people and once you get the right people watching the scan live and the right operators involved.
Then you can scan.
Carefully.

Think of it as a test plan. Once you are comfortable with it then go ahead.

You always need change control and you always need to understand the implications if something goes wrong and be able to adjust for them.

With all of that said, every security professional out there has made a mistake scanning. This is doubly true for people who haven't grown up through the IT security ranks (and dealt with the scanning disasters there). There is a whole religious debate about whether it is OK to scan on the IT side, let alone on the CNI side.

My take is this: if it can be done properly (and it can), then not scanning means you don't know what can go wrong. You have no idea what the environment is like.

Doing security design in that environment is like a doctor performing surgery with a blindfold and oven mitts. You are lucky if you can even pick up the right tools.

Many good security professionals have gotten bitten by bad scans. In the SCADA world it makes sense to be extra careful, especially after seeing what can happen, but that doesn't mean scans don't add value.

The key point of the Myths is to make sure that CNI guys know there is no difference between IT systems and DCS systems, and that IT guys know they are not the same.

That statement is not an oxymoron. Within context for each group it is true.

I have done hundreds of scans on PCNs successfully without problems. I wouldn't let just anyone do it, but it is possible and, more than that, it is essential.
20/11/06 4:23 PM

Update:

More

and from Rich at Securosis

Fact # 2 - SCADA Deperimeterization is here

Deperimeterization is happening with DCS whether we like it or not.

Part 1

At the same time that many organizations are scrambling to insert protective firewalls for their automation systems, business and operational needs are increasing the interconnectivity of the very systems that need protection. For automation systems, the real risk may be the inability to monitor the operation and respond to changing operational dynamics, rather than improper access by a small subset of individuals. Because of these competing requirements, even when strong perimeter controls are implemented they rapidly atrophy in effectiveness. Firewalls become so riddled with holes that their ability to provide control functions is severely limited. It is naive to assume that control systems can be isolated.

Look, having outside connectivity sometimes provides more value than the risk it incurs. This is especially true for monitoring-only systems. Say there is a rig or other high location with a reading that has to be taken periodically. With an RTU up there, it is no longer necessary for someone to climb up. In fact, they don't even have to get in a truck and drive out to it; they can take the reading from the comfort of the maintenance shack. That is a huge safety improvement. A lot of people would argue that this is simple, but many organizations are still climbing ladders to take readings on a regular basis. Installing a cheap and easy RTU can literally save lives here, not to mention adding accuracy and precision that will ultimately result in savings. Even 1% or 2% can mean the difference between profitability and loss for some low-end sites, and they cannot afford complex security arrangements.
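To make the RTU example concrete, here is a minimal sketch of what "reading it from the maintenance shack" often looks like on the wire: a Modbus/TCP read of a couple of holding registers. This is illustrative only; the IP address, unit ID and register layout are hypothetical and vary by device, and a real deployment would sit behind the kinds of controls discussed in these posts.

```python
import socket
import struct

def build_read_request(tid, unit, start_addr, count):
    """Build a Modbus/TCP 'read holding registers' (function 0x03) frame.
    MBAP header: transaction id, protocol id (always 0), length, unit id."""
    return struct.pack(">HHHBBHH", tid, 0, 6, unit, 3, start_addr, count)

def parse_registers(resp):
    """Pull the 16-bit register values out of a function 0x03 response."""
    byte_count = resp[8]  # MBAP header is 7 bytes, then function code, then byte count
    return list(struct.unpack(">%dH" % (byte_count // 2), resp[9:9 + byte_count]))

def read_rtu(host, unit=1, start_addr=0, count=2, port=502):
    """Open a TCP session to the RTU, read `count` registers, close."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(build_read_request(1, unit, start_addr, count))
        return parse_registers(s.recv(260))

if __name__ == "__main__":
    # Hypothetical RTU address and register map.
    print(read_rtu("192.0.2.10", unit=1, start_addr=0, count=2))
```

The point being: the whole "climb the ladder" trip collapses into a 12-byte request and a short response, which is exactly why these devices end up on networks, and exactly why anything that can reach that port can read (or, with other function codes, write) the process.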

(By the way Tofino might be an answer for them. Eric has the site up for his new company. I'll post more later after I get a chance to talk to him.)

In a different scenario you have a large, complex site with thousands of variables. In a location like this the interconnection to the PCN provides many essential functions. Many, actually most, significant accidents could have been avoided by having the right people know the right data earlier. Historian feeds to external aggregation points allow engineers across the world to monitor and troubleshoot. Expert talent can be pooled and can always see data from major sites. Subject matter experts can see the data in real time. Not only can this improve safety and efficiency in a lot of companies, it already is and has. Other improvements come in logistics (both supplying and planning production to feed customers), maintenance, capacity planning, and many other areas.

Suffice it to say that these systems need to talk to the real world and vice versa. Firewalls are a must (at least for open- and closed-loop controls), but just like in the IT world their utility is out of date and waning. More needs to happen. Mike at my company likes to use a statement (that he claims is several steps from its originator via the CTO of N-Circle) that fits this process.

"Firesuits not Firewalls."

I don't think anyone is advocating complete elimination of firewalls the key is that they are not enough.

Patching has to happen and has to be able to happen quickly. Access controls have to exist and be enforced. Behavior-based protections have to be applied (within reason). Memory protection should be considered. It is essential that you know what your environment looks like and what it is vulnerable to, using tools like Nessus and CoreImpact. Things have to be measured.

The environment has to be monitored.

Not all of these will apply to every system, of course, but overall all of the tired clichés need to be followed. The key is that they are just as essential in SCADA systems.

The perimeter is leaving SCADA because there is more good to be gained than bad (like it or not from a security perspective) so it is time to adapt your security strategy.

17 November, 2006

First Fact - Exponential Yes "Exponential" Growth

Automation control systems are expanding exponentially in complexity, numbers, interconnectivity, and capability.

When I first floated this one I had several people challenge the exponential piece. They were right to do so but I am standing by the statement.

There are any number of ways this can be verified, but the easiest is to read the financial reports of the companies that specialize in these systems. For the last several years, quarter on quarter, they have shown consistent unit and revenue growth. You might not see this in an individual purchasing company, but I bet the larger ones show the same type of growth when the entire company is looked at. Certainly I see it in the budgets I have seen.

If you want to get away from financial indicators, then look at the number of PLCs and RTUs being aggregated in your historians and monitoring centers. I know a couple of oil companies that are devoting whole building floors to these systems and their maintenance.

This is also backed up by power industry news that shows clear areas where growth is available.

I will make one concession, though. Even though the current growth rate appears exponential, we haven't yet reached the flattening of the curve, and even when we do, that doesn't rule out a plateau occurring later.

Regardless, right now automation, SCADA, DCS and PCN systems are undergoing the same explosive growth that all other areas of the information industry have undergone and are undergoing. There is a several-year lag due to differences in the capital cycle, implementation and usage, but the overall trend is identical. This means we can expect more and more interconnectivity, more accessible systems and, most importantly, more direct control of functions that previously could not easily be controlled remotely. Furthermore, we should expect these cycles to shorten in length and to converge toward the technology adoption rates seen elsewhere. No more 20-year cycles; 10-, 8-, 5- or even 2-year cycles (for the software portions) will begin to emerge.

This obviously has security implications but it also means great benefits are being accrued.

Part 1 Here

DCS Facts that we have to deal with

Last week I did a series on 5 Myths of Process Control Security.

This is a follow-on series to that one. There are a number of facts we have to deal with in the SCADA security world, and these are 5 of them. Most of them look bad on the surface, but there are some underlying advantages that might not be apparent. So:

Pandora's box is open, or the genie is out of the bottle; your choice.

1. Automation control systems are expanding exponentially in complexity, numbers, interconnectivity, and capability.

2. Deperimeterization is happening with DCS whether we like it or not.

3. Standardization with existing IT vendors is happening to SCADA systems and is subjecting new areas (control systems) to old threats (hackers and worms). This results in the creation of significant risks to safety, the environment and business.

4. The bad guys are now realizing that there is something here but so are the good guys.

5. Bad things have already happened and more will.

Part 2

16 November, 2006

5 Myths - Part 6 of 6

Part 1

Myth # 5 - You cannot scan or update Automated Control systems.

Scanning and updates are just as essential for these systems as for any other IT system (or more so, because of the geographic and ownership distribution). Scanning and updating need to be done carefully, within change management and with good communication to the users of the systems.

The key phrase here is change management. All stakeholders must know when and how the scans will occur. From the Engineering Authority to the operators (current, off-going and oncoming), everyone must be informed. This also means you need a scanning tool that can track and log (verifiably), to the second, exactly what it is doing to the end system.

The last part is why I prefer CoreImpact over Nessus.

Both are good, but Core gives you verifiable CYA (and, in many cases, easier granular control).

In all cases you should know what you are doing to what, when and why, and be able to explain it to the engineers and operators. If you can't, then you shouldn't be doing the scan.
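As a toy sketch of what "track and log, to the second, exactly what it is doing" can look like: a deliberately slow TCP connect probe that writes a timestamped log line before and after every single probe. This is not Nessus or CoreImpact, just an illustration of the logging and pacing discipline; the hosts, ports and delay value are assumptions you would set inside your change window.

```python
import logging
import socket
import time

# Timestamped audit trail: one line before and one after every probe.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

def tcp_connect_scan(host, ports, timeout=2.0, delay=0.5):
    """Probe one port at a time so there is a to-the-second record of
    exactly what was sent where, and fragile gear never sees a burst."""
    results = {}
    for port in ports:
        logging.info("probing %s:%d", host, port)
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(timeout)
        try:
            s.connect((host, port))
            results[port] = "open"
        except OSError:  # refused, timed out, or filtered
            results[port] = "closed/filtered"
        finally:
            s.close()
        logging.info("result  %s:%d %s", host, port, results[port])
        time.sleep(delay)  # pace the scan deliberately
    return results
```

Run something like this against one known-good host first, with the operators watching, before pointing anything at equipment that matters.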

With those caveats made, once you get the process down it becomes a non-event (other than fixing the problems that are found, which for a while will be many). It was a weekly event at one of the companies where I was CISO.

As for updates, not only is it possible to do them, it is essential that they are done. Again, with proper change management, not just arbitrarily.

14 November, 2006

Control Systems




I think it is time to revive one of my pre-blog writings that tried to get people interested in security issues around SCADA systems. If you want to know what they are and why they matter, take the time to read this. It is a big problem and getting bigger. If you are already into the topic, it might give you some catch phrases and other angles for the non-initiated. A little campy, but I didn't write it for a blog.


Control systems are everywhere. From nuclear plants to elevators, automobile manufacturing robots to remote surgery, multibillion-dollar offshore oil facilities to children’s toys, computers are controlling more things every day. Automated control systems are not a new phenomenon. In the first decade of the 1800s, punch cards were used to operate weaving looms, running relatively primitive programs that used mechanical interlocks and stops to control the equipment. Digital computerized systems have been around for more than half a century, and short-range radio monitoring and control existed well before wireless became a common term in the IT world. These systems go by many different names; Automated Control Systems (ACS), SCADA, DCS and Process Control Systems are among the most common. Ultimately, what they do is what defines them as a separate category. Whatever the name, the defining function of control systems is their ability to directly, physically manipulate the real world.

In the last decade there has been a revolution in the automated control world. Mirroring the advancements of the information technology world, ACS have become more integrated, easier to connect to, and standardized.

Many systems are now directly or indirectly connected to an IP network that is ultimately connected to the internet. The key control and programming point of these systems is often run as an application on one of the common Operating Systems.

This standardization and interconnectivity has had a dramatic positive effect on the efficiency, safety and ease of implementation of these systems.

Because these systems are often more complicated than other computing systems, have a higher capital cost, and are tied to physical infrastructure, adoption of the newest generation lags the IT and internet world by 8 to 10 years. This puts the ACS world right in the middle of the turn-of-the-millennium IT environment. The same paradigms apply. There are and will be dramatic impacts on business models. Irrational exuberance abounds. A huge amount of money is being spent and saved.

Finally the security challenges of the early internet days are now being felt in systems that control our power, water distribution, oil pipelines and wastewater removal plants.

This final point cannot be overstated. The same viruses, spam, pop-ups and botnets that give the IT world and the average home PC user headaches can affect the power supply to your house and business and change the way that the natural gas pipeline in the back of the neighborhood works.

There are two key questions that define the debate about how, or even whether, to direct resources to protecting ACS. Can control systems be accessed and controlled by unwanted individuals? What will or can happen if they do access them?

The answer to the first question is a direct and simple yes. Not only can these systems be accessed, they have been. If a system is connected to any other IT or telecom system, then it can be reached and controlled.

The answer to the second is less direct: it depends. It depends on what the ACS is controlling, how much and how fast a human can get involved, and most importantly how the underlying system integrates into the process being controlled. In most cases production can be stopped or efficiency impacted. In some cases people can be hurt or killed, large amounts of environmental harm can occur, and huge amounts of money can be lost.

A number of high profile incidents are easy to find.

The California power grid was compromised and service was almost interrupted, wastewater has fouled beaches, David Beckham’s car was unlocked, started and stolen twice, and the Slammer worm was found in the systems of a nuclear plant.

From the silly to the terrifying, compromises of automated control systems are occurring daily. Ultimately these incidents show the public side of the impact, but the real threats can be subtle. Control systems are not designed to identify abuse and hacking. Until recently, identification of attacks specifically directed at ACS was not available or possible. In many organizations the control systems are not located on a segment of the network that allows easy differentiation of unwanted traffic. The result of these and other weaknesses in existing architectures is that the real level of compromise, and therefore the threat and risk levels, are difficult or impossible to determine for most organizations without acquiring greater information and understanding.

People are doing things to fix it but more needs to be done and faster.
