Web Admin Blog

Real Web Admins. Real World Experience.

Software Assurance Maturity Model (SAMM)

This presentation on the OWASP Software Assurance Maturity Model (SAMM) was by Pravir Chandra, the project lead.  I was actually really excited to see this topic on the schedule, as SAMM is something that I’ve been toying with for my organization for a while.  It’s actually a very simple and intuitive approach to assessing where your organization is in terms of software security maturity, where you want to get to, and how to get there.  My notes on this presentation are below:

By the end of the presentation, you should be able to…

  • Evaluate an organization’s existing software security practices
  • Build a balanced software security assurance program in well-defined iterations
  • Demonstrate concrete improvements to a security assurance program
  • Define and measure security-related activities throughout the organization

Lessons Learned

  • Microsoft SDL
    • Heavyweight, good for large ISVs
  • Touchpoints
    • High-level, not enough details to execute against
  • CLASP
    • Large collection of activities, but no priority ordering
  • ALL: Good for experts to use as a guide, but hard for non-security folks to use off the shelf

Drivers for a Maturity Model

  • An organization’s behavior changes slowly over time
    • Changes must be iterative while working toward long-term goals
  • There is no single recipe that works for all organizations
    • A solution must enable risk-based choices tailored to the organization
  • Guidance related to security activities must be prescriptive
    • A solution must provide enough details for non-security people
  • Overall, must be simple, well-defined, and measurable

Therefore, a viable model must…

  • Define building blocks for an assurance program
    • Delineate all functions within an organization that could be improved over time
  • Define how building blocks should be combined
    • Make creating change in iterations a no-brainer

SAMM Business Functions (4 in total)

  • Start with the core activities tied to any organization performing software development
  • Named generically, but should resonate with any developer or manager
  • Governance, Construction, Verification, Deployment

SAMM Security Practices (12 in total)

  • From each of the Business Functions, 3 Security Practices are defined
  • The Security Practices cover all areas relevant to software security assurance
  • Each one is a ‘silo’ for improvement
  • Governance: Strategy & Metrics, Education & Guidance, Policy & Compliance
  • Construction: Threat Assessment, Security Requirements, Secure Architecture
  • Verification: Design Review, Code Review, Security Testing
  • Deployment: Vulnerability Management, Environment Hardening, Operational Enablement

What is “software”?

  • Lots of different aspects of what software is
  • Could be a tarball of source code, UML and specifications, or a server running the code

Under each Security Practice

  • Three successive Objectives under each Practice define how it can be improved over time
  • Level 1, Level 2, and Level 3
  • “Going from crawling to walking to running”
  • 72 different activities, all about the size of a bread box

Per Level, SAMM defines…

  • Objectives
  • Activities
  • Results
  • Success Metrics (2-4 metrics for each objective)
  • Costs (training, content, license, or buildout)
  • Personnel (overhead on different roles from operating at this level)

Conducting Assessments

  • SAMM includes assessment worksheets for each Security Practice

Assessment Process

  • Supports both lightweight and detailed assessments
  • Organizations may fall in between levels (indicated with a ‘+’)

Creating Scorecards

  • Gap Analysis
    • Capturing scores from detailed assessments versus expected performance levels
  • Demonstrating Improvement
    • Capturing scores from before and after an iteration of assurance program build-out
  • Ongoing Measurement
    • Capturing scores over consistent time frames for an assurance program that is already in place

Roadmap Templates

  • To make the “building blocks” usable, SAMM defines roadmap templates for typical kinds of organizations
    • Independent SW Vendors
    • Online Service Providers
    • Financial Services Organizations
    • Government Organizations
  • Organization types chosen because
    • They represent common use-cases
    • Each organization has variations in typical software-induced risk
    • Optimal creation of an assurance program is different for each

Expert Contributions

  • Built based on collected experiences with hundreds of organizations
    • Including security experts, developers, architects, development managers, IT managers

Industry Support

  • Several case studies already
  • Several more case studies underway

The OpenSAMM Project

  • http://www.opensamm.org
  • Dedicated to defining, improving, and testing the SAMM framework
  • Always vendor-neutral, but lots of industry participation
  • Targeting new releases every ~18 months
  • Change management process

Future Plans

  • Mappings to existing standards and regulations
  • Additional roadmaps where need is identified
  • Additional case studies

Enterprise Application Security – GE’s Approach to Solving Root Cause

The first presentation of the day that I went to  was by GE’s Darren Challey and was about GE’s application security program and how he took a holistic approach to securing the enterprise.  My notes on this presentation are below:

Why is AppSec so hard?

  • AppSec changes rapidly (look at difference between 2004, 2007, and 2010 Top 10)
  • Changing landscape
    • Increased skill and talent pool of technically proficient individuals willing to break the law
    • Growing volume of financially valuable data online
    • Development of criminal markets (black markets) to facilitate conversion to money
  • “Attackers now have effective skills, something to steal, and a place to sell it”
  • Application Security is a completely one-sided game
  • Need to become an enabler (not a barrier)
  • Must inject application security earlier through Guidance, Education, and Tools
  • Must understand the development and deployment process and integrate rather than mandate
  • NIST study on cost to repair defects when found at different stages of software development (http://www.nist.gov/director/prog-ofc/report02-3.pdf)
  • Solving the problem of the enterprise (Culture Change)
  • Success factors:
    • Form a mission and strategy
    • Develop policy (but not corporate “mandate”)
    • Gain executive buy-in (cost / benefit / risk)
    • Understand the magnitude of problem (metrics)
    • Asset inventory and vulnerability management
    • Develop standards (what should I do and when?)
    • Establish a formal program (strong leadership)
    • Focus on education and training materials
    • Develop in-house expertise, services and “COE”
    • Continuous improvement, measurement, KPI
    • Communicate!
    • Drive a culture change (shared need, WIIFM)
    • Communicate expectations with vendors
    • Implement incentives (and penalties)
    • Digitize after the process is solid (tools)
  • AppSec program mission & structure
  • AppSec program strategy
  • Policy (Guidance) -> Standards (Guidance) -> Training (Education) -> Metrics (Tools) -> Security tools (Tools) -> Inventory & tracking (Tools) -> Monitor & Improve

Guidance

  • “GE Application Security Working Group” (Talking to the businesses is critical!  Meet every 2 weeks.)
  • Secure Coding Guidelines
  • Vulnerability Remediation Guide
  • Secure Deployment
  • Quick Reference Card
  • Contractual Language
  • Desk Calendars
  • Metrics: AppSec calendars helped increase visitors to key Guidance materials  (track hits to website docs when certain activities take place)

Education

  • CBT1: Intro to AppSec at GE (60 min for any IT person) – why AppSec is important and what happens when you don’t do it
  • CBT2: GE Best Practices for Secure Coding (90 min)
  • CBT3: Attack Profiles & Countermeasures (120 min for security people)
  • Developer Awareness Assessment:
    • 100’s of internally-developed questions
    • Randomized questions, timed completion
    • Vendors track their own results
    • Allows tailoring of training/awareness programs

Tools

  • COE AppSec assessment services
  • Vendor framework & Metrics
  • Compliance handbook
  • Common objects repository
  • GE Enterprise Application Security
  • Scanning and Monitoring tools
  • Automation is the way to go (but the tools are not quite there yet)

Metrics

  • Measure Vendor AppSec Performance (Avg % Critical/High Vulnerabilities per Assessment vs % Assessments with Zero Critical/High Vulnerabilities)
  • Is it making a difference?  (chart the average number of critical/high vulnerabilities per assessment over time)

Forming a Center of Excellence

  • Combines the best available people, processes and tools
  • Formal training & defined roles (Comprehensive training program for all auditors to ensure skills are kept current and that auditors can provide more than one type of service)
  • COE Team structure (tools, research, operations, stakeholder management, queue management, application security auditors)
  • Application Assessment Types (black/grey box vs white box)
  • Application assessment process (map of the workflow with “swim lanes” of who does each step)
  • Measure number of vulnerabilities and severities
  • Measure customer satisfaction (overall, ease of engagement, responsiveness)

All About OWASP

The second presentation of the morning featured various members of the OWASP board speaking about the goals of OWASP for the upcoming year.  My summary is below.

Jeff Williams

  • Cross Site Scripting is an epidemic
  • We need to view insecure software as a disgrace
  • Everything OWASP is free and void of commercialism
  • “When information comes with an agenda, people discount it”

Tom Brennan

Global Membership Committee 2010 Focus

  • Global expansion
  • 7x increase (2008)
  • Vote for your board members

Global Industry Committee 2010 Focus

  • Building industry special interest groups
  • Continuing to impact regulation (NIST, government, organizations, EU)

Dave Wichers

Global Conferences Committee 2010 Focus

  • Support four global AppSec Conferences per year
  • Support OWASP regional and local events worldwide

Sebastian Deleersnyder

Global Education Committee 2010 Focus

  • Academic outreach
  • OWASP bootcamp
  • Roll out college OWASP education kits

Global Chapter Committee 2010 Focus

  • Identify and reactivate inactive chapters
  • Actively support chapters with mentors and speakers
  • College OWASP education kits

Dinis Cruz

Global Projects Committee 2010 Focus

  • Apply assessment criteria version 2 to all projects
  • Unified dashboard for OWASP projects
  • Launch and manage 2010 season of code

Keynote: Collaboratively Advancing Strategies to Mitigate Software Supply Chain Risks

It’s my second year at the OWASP AppSec Conference and this year it is in Washington, DC.  The New York City Conference last year proved to be probably the best conference I’ve ever been to.  Based on the agenda and the facilities, this year is looking very promising.  Today’s keynote is by Joe Jarzombek, the Director for Software Assurance in the National Cyber Security Division, under the Office of the Assistant Secretary for Cybersecurity and Communications.  Man, is that a mouthful.  My notes on the presentation are below:

DHS NCSD Software Assurance Program

  • A public/private collaboration that promotes security and software resilience throughout the SDLC
  • Reduce exploitable software weaknesses
  • Address means to improve capabilities that routinely develop, acquire, and deploy resilient software products
  • IT/Software Security risk landscape is a convergence between “defense in depth” and “defense in breadth”
  • Applications now cut through the security perimeter
  • Rather than attempt to break or defeat network or system security, hackers opt to target application software to circumvent security controls
    • 75% of hacks are at the application level
    • Most exploitable software vulnerabilities are attributed to non-secure coding practices
  • Enable software supply chain transparency
    • Acquisition managers and users factor risks posed by the software supply chain into the trade-space of risk mitigation efforts
  • DHS Software Assurance program scoped to address:
    • Trustworthiness
    • Dependability
    • Survivability
    • Conformity
  • Standalone Common Body of Knowledge (CBK) drawing upon contributing companies/industries

Build Security In: https://buildsecurityin.us-cert.gov

  • Focus on making software security a normal part of software engineering
  • Process-agnostic lifecycle approach
  • There was an interesting slide on touchpoints and artifacts that I took a picture of with my phone and I will try to post here.

Resources to Check Out

“Software Security Engineering: A Guide for Project Managers”

“Enhancing the Development Lifecycle to Produce Secure Software”

“Fundamental Practices for Secure Software Development” (SAFECode_Dev_Practices1008.pdf)

The Software Assurance Pocket Guide Series

Software Assurance in Acquisition: Mitigating Risks to the Enterprise

  • Check out Appendix D – Software Due Diligence Questionnaires

“Making the Business Case for Software Assurance”

“Measuring … Assurance”

Common Weakness Enumeration (CWE)

  • If you have this weakness, then it’s not a matter of if, but when you’ll be breached.

Dang, People Still Love Them Some IE6

We get a decent bit of Web traffic here on our site.  I was looking at the browser and platform breakdowns and was surprised to see IE6 still in the lead!  I’m not sure if these stats are representative of “the Internet in general,” but I am willing to bet they are representative of enterprise-type users, and we get enough traffic that most statistical noise should be filtered out.  I thought I’d share this; most of the browser market share research out there is more concerned with the IE vs. Firefox (vs. whoever) competition aspect and less about useful information like versions.  Heck, we had to do custom work to get the Firefox version numbers; our Web analytics vendor doesn’t even provide that.  In the age of more Flash and Silverlight and other fancy schmancy browser tricks, disregarding what versions and capabilities your users run is probably a bad idea.

  1. IE6 – 23.46%
  2. IE7 – 21.37%
  3. Firefox 3.5 – 17.28%
  4. IE8 – 14.62%
  5. Firefox 3 – 12.52%
  6. Chrome – 4.38%
  7. Opera 9 – 2.20%
  8. Safari – 1.95%
  9. Firefox 2 – 1.27%
  10. Mozilla – 0.48%

It’s pretty interesting to see how many people are still using that old of a browser, probably the one their system came loaded with originally.  On the Firefox users, you see the opposite trend – most are using the newest and it tails off from there, probably what people “expect” to see.  The IE users start with the oldest and tail towards the newest!  You’d think that more people’s IT departments would have mandated newer versions at least.  I wish we could see what percentage of our users are hitting “from work” vs. “from home” to see if this data is showing a wide disparity between business and consumer browser tech mix.

Bonus stats – Top OSes!

  1. Windows XP – 76.5%
  2. Windows Vista – 14.3%
  3. Mac – 2.7%
  4. Windows NT – 1.8%
  5. Linux – 1.8%
  6. Win2k – 1.5%
  7. Windows Server 2003 – 1.2%

Short form – “everyone uses XP.”  Helps explain the IE6 popularity because that’s what XP shipped with.

Edit – maybe everyone but me knew this, but there’s a pretty cool “Market Share” site that lets people see in depth stats from a large body of data…  Their browser and OS numbers validate ours pretty closely.

Oracle + BEA Update

A year ago I wrote about Oracle’s plan for combining BEA WebLogic and OAS.  A long time went by before any more information appeared – we met with our Oracle reps last week to figure out what the deal is.  The answer wasn’t much clearer than it was way back last year.  They certainly want some kind of money to “upgrade,” but it seems poorly thought through.

OAS came in various versions – Java, Standard, Standard One, Enterprise, and then the SOA Suite versions.  The new BEA, now “Fusion Middleware 11g,” comes in different versions as well.

  • WLS Standard
  • WLS Enterprise – adds clustering, costs double
  • WLS Suite – adds Coherence, Enterprise Manager, and JRockit realtime, costs quadruple

But they can’t tell us what OAS product maps to what FMW version.

There is also an oddly stripped-down “Basic” edition which is noted as being a free upgrade from OAS SE, but it strips out a lot of JMS and WS stuff; there’s an entire slide of stuff that gets stripped out, and it’s hard to say if this would be feasible for us.

As for SOA Suite, “We totally just don’t know.”

Come on Oracle, you’ve had a year to get this put together.  It’s pretty simple, there’s not all that many older and newer products.  I suspect they’re being vague so they can feel out how much $$ they can get out of people for the upgrade.  Hate to break it to you guys – the answer is $0.  We didn’t pay for OAS upgrades before this, we just paid you the generous 22% a year maintenance that got you your 51% profit margin this year. If you’re retiring OAS for BEA in all but name, we expect to get the equivalent functionality for our continued 22%.

Oracle has two (well, three) clear to-dos.

1.  Figure out what BEA product bundles give functionality equivalent to old OAS bundles

2.  Give those to support-paying customers

3.  Profit.  You’re making plenty without trying to upcharge customers.  Don’t try it.

Velocity 2009 – Best Tidbits

Besides all the sessions, which were pretty good, a lot of the good info you get from conferences is by networking with other folks there and talking to vendors.  Here are some of my top-value takeaways.

Aptimize is a New Zealand-based company that has developed software to automatically do the most high value front end optimizations (image spriting, CSS/JS combination and minification, etc.).  We predict it’ll be big.  On a site like ours, going back and doing all this across hundreds of apps will never happen – we can engineer new ones and important ones better, but something like this which can benefit apps by the handful is great.

I got some good info from the MySpace people.  We’ve been talking about whether to run our back end as Linux/Apache/Java or Windows/IIS/.NET for some of our newer stuff.  In the first workshop, I was impressed when the guy asked who all runs .NET and only one guy raised his hand.   MySpace is one of the big .NET sites, but when I talked with them about what they felt the advantage was, they looked at each other and said “Well…  It was the most expeditious choice at the time…”  That’s damning with faint praise, so I asked about what they saw the main disadvantage being, and they cited remote administration – even with the new PowerShell stuff it’s just still not as easy as remote admin/CM of Linux.  That’s top of my list too, but often Microsoft apologists will say “You just don’t understand because you don’t run it…”  But apparently running it doesn’t necessarily sell you either.

Our friends from Opnet were there.  It was probably a tough show for them, as many of these shops are of the “I never pay for software” camp.  However, you end up wasting far more in skilled personnel time if you don’t have the right tools for the job.  We use the heck out of their Panorama tool – it pulls metrics from all tiers of your system, including deep in the JVM, and does dynamic baselining, correlation and deviation.  If all your programmers are 3l33t maybe you don’t need it, but if you’re unsurprised when one of them says “Uhhh… What’s a thread leak?” then it’s money.

ControlTier is nice, they’re a commercial open source CM tool for app deploys – it works at a higher level than chef/puppet, more like capistrano.

EngineYard was a really nice cloud provisioning solution (sits on top of Amazon or whatever).  The reality of cloud computing as provided by the base IaaS vendors isn’t really the “machines dynamically spinning up and down and automatically scaling your app” they say it is without something like this (or lots of custom work).  Their solution is, sadly, Rails only right now.  But it is slick, very close to the blue-sky vision of what cloud computing can enable.

And also, I joined the EFF!  Cyber rights now!

You can see most of the official proceedings from the conference (for free!):

Velocity 2009 – Monday Night

After a hearty trip to Gordon Biersch, Peco went to the Ignite battery of five minute presentations, which he said was very good.  I went to two Birds of a Feather sessions, which were not.  The first was a general cloud computing discussion which covered well-trod ground.  The second was by a hapless Sun guy on Olio and Faban.  No, you don’t need to know about them.  It was kinda painful, but I want to commend that Asian guy from Google for diplomatically continuing to try to guide the discussion into something coherent without just rolling over the Sun guy.  Props!

And then – we were lame and just turned in.  I’m getting old, can’t party every night like I used to.  (I don’t know what Peco’s excuse is!)

Velocity 2009 – Scalable Internet Architectures

OK, I’ll be honest.  I started out attending “Metrics that Matter – Approaches to Managing High Performance Web Sites” (presentation available!) by Ben Rushlo, Keynote proserv.  I bailed after a half hour to the other one, not because the info in that one was bad but because I knew what he was covering and wanted to get the less familiar information from the other workshop.  Here’s my brief notes from his session:

  • Online apps are complex systems
  • A siloed approach of deciding to improve midtier vs CDN vs front end engineering results in suboptimal experience to the end user – have to take holistic view.  I totally agree with this, in our own caching project we took special care to do an analysis project first where we evaluated impact and benefit of each of these items not only in isolation but together so we’d know where we should expend effort.
  • Use top level/end user metrics, not system metrics, to measure performance.
  • There are other metrics that correlate to your performance – “key indicators.”
  • It’s hard to take low level metrics and take them “up” into a meaningful picture of user experience.

He’s covering good stuff but it’s nothing I don’t know.  We see the differences and benefits in point in time tools, Passive RUM, tagging RUM, synthetic monitoring, end user/last mile synthetic monitoring…  If you don’t, read the presentation, it’s good.  As for me, it’s off to the scaling session.

I hopped into this session a half hour late.  It’s Scalable Internet Architectures (again, go get the presentation) by Theo Schlossnagle, CEO of OmniTI and author of the similarly named book.

I like his talk, it starts by getting to the heart of what Web Operations – what we call “Web Admin” hereabouts – is.  It kinda confuses architecture and operations initially but maybe that’s because I came in late.

He talks about knowledge, tools, experience, and discipline, and mentions that discipline is the most lacking element in the field. Like him, I’m a “real engineer” who went into IT so I agree vigorously.

What specifically should you do?

  • Use version control
  • Monitor
  • Serve static content using a CDN, and behind that a reverse proxy and behind that peer based HA.  Distribute DNS for global distribution.
  • Dynamic content – now it’s time for optimization.

Optimizing Dynamic Content

Don’t pay to generate the same content twice – use caching.  Generate content only when things change and break the system into components so you can cache appropriately.

Example: a PHP news site – articles are in Oracle, personalization on each page, top new forum posts in a sidebar.

Why abuse Oracle by hitting it on every page view?  Updates are controlled.  The page should pull user prefs from a cookie.  (P.S. rewrite your query strings.)

But it’s still slow to pull from the db vs. hardcoding it – all blog software does this, for example.  Check for a hardcoded PHP page; if it’s not there, run something that puts it there, but still dynamically put in user personalization from the cookie.  In the preso he provides details on how to do this; a rough sketch of the idea is below.

Do cache invalidation on content change, and use a message queuing system like OpenAMQ for async writes.  Apache is now the bottleneck – use APC (Alternative PHP Cache), or use memcached – he says no timeouts!  Or… be careful about them!  Or something.
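Here’s a rough sketch of that pattern – check for a pre-generated page, regenerate on a miss, keep personalization dynamic.  This is my own illustration in Python rather than PHP; the cache directory, the render function, and the cookie format are all made up:

```python
# Minimal sketch (assumptions: Python instead of the PHP from the talk; the
# render_article() query and cookie contents are hypothetical stand-ins).
import os

CACHE_DIR = "/tmp/page_cache"          # pre-generated article HTML lives here
os.makedirs(CACHE_DIR, exist_ok=True)

def render_article(article_id):
    """Expensive part: hit the database and build the shared HTML once."""
    # ... SELECT from the articles table would go here ...
    return f"<html><body><h1>Article {article_id}</h1>{{PERSONALIZATION}}</body></html>"

def get_article_page(article_id, user_prefs_cookie):
    """Serve the hardcoded/static copy; only regenerate on a cache miss."""
    path = os.path.join(CACHE_DIR, f"article_{article_id}.html")
    if not os.path.exists(path):                  # miss: generate and persist
        with open(path, "w") as f:
            f.write(render_article(article_id))
    with open(path) as f:
        page = f.read()
    # Personalization stays dynamic: pulled from the cookie, not the database.
    name = user_prefs_cookie.get("name", "guest")
    return page.replace("{PERSONALIZATION}", f"<div>Hello, {name}</div>")

def invalidate_article(article_id):
    """Call this from the publish/update path (or an async queue consumer)."""
    try:
        os.remove(os.path.join(CACHE_DIR, f"article_{article_id}.html"))
    except FileNotFoundError:
        pass

if __name__ == "__main__":
    print(get_article_page(42, {"name": "alice"}))   # first call generates
    print(get_article_page(42, {"name": "bob"}))     # second call is a cache hit
```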

Scaling Databases

1. shard them
2. shoot yourself

Sharding, or breaking your data up by range across many databases, means you throw away relational constraints and that’s sad.  Get over it.

You may not need relations – use files fool!  Or other options like couchdb, etc.  Or hadoop, from the previous workshop!

Vertically scale first by:

  • not hitting the damn db!
  • run a good db.  postgres!  not mySQL boo-yah!

When you have to go horizontal, partition right – more than one shard shouldn’t answer an OLTP question.  If that’s not possible, consider duplication.

IM example.  Store messages sharded by recipient.  But then the sender wants to see them too and that’s an expensive operation – so just store them twice!!!
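A toy illustration of that dual-write idea (my own sketch, not from the talk – the hash-modulo routing and the message format are made up, and the shards are plain dicts standing in for separate databases):

```python
# Minimal sketch of "shard by recipient, duplicate by sender".
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]   # each dict stands in for one database

def shard_for(user_id):
    # Simple hash/modulo routing; a range- or directory-based scheme also works.
    return shards[hash(user_id) % NUM_SHARDS]

def send_message(sender, recipient, body):
    msg = {"from": sender, "to": recipient, "body": body}
    # Write twice so both inbox and outbox reads stay single-shard (OLTP-friendly).
    shard_for(recipient).setdefault(("inbox", recipient), []).append(msg)
    shard_for(sender).setdefault(("outbox", sender), []).append(msg)

def inbox(user):
    return shard_for(user).get(("inbox", user), [])

def outbox(user):
    return shard_for(user).get(("outbox", user), [])

send_message("alice", "bob", "hi")
print(inbox("bob"), outbox("alice"))   # both lookups hit exactly one shard
```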

But if it’s not that simple, partitioning can hose you.

Do math and simulate it before you do it fool!   Be an engineer!

Multi-master replication doesn’t work right.  But it’s getting closer.

Networking

The network’s part of it, can’t forget it.

Of course if you’re using Ruby on Rails the network will never make your app suck more.  Heh, the random drive-by disses rile the crowd up.

A single machine can push a gig.  More isn’t hard with aggregated ports.  Apache too, serving static files.  Load balancers too.  How to get to 10 or 20 Gbps though?  All the drivers and firmware suck.  Buy an expensive LB?

Use routing.  It supports naive LB’ing.  Or routing protocol on front end cache/LBs talking to your edge router.  Use hashed routes upstream.  User caches use same IP.  Fault tolerant, distributed load, free.

Use isolation for floods.  Set up a surge net.  Route out based on MAC.  Used vs DDoSes.

Service Decoupling

One of the most overlooked techniques for scalable systems.  Why do now what you can postpone till later?

Break transaction into parts.  Queue info.  Process queues behind the scenes.  Messaging!  There’s different options – AMQP, Spread, JMS.  Specifically good message queuing options are:

Most common – STOMP, sucks but universal.

Combine a queue and a job dispatcher to make this happen.  Side note – Gearman, while cool, doesn’t do this – it dispatches work but it doesn’t decouple action from outcome – should be used to scale work that can’t be decoupled.  (Yes it does, says dude in crowd.)
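To make the queue-plus-dispatcher idea concrete, here’s a minimal sketch (my own, not from the preso).  An in-process queue.Queue stands in for a real broker like AMQP/STOMP/JMS, and the “send confirmation email” job is a made-up example of work you’d defer:

```python
# Minimal sketch of decoupling a transaction from its slow side effects.
import queue
import threading
import time

work_queue = queue.Queue()

def handle_request(order_id):
    """Fast, user-facing part of the transaction: record intent and return."""
    work_queue.put({"order_id": order_id, "action": "send_confirmation_email"})
    return f"order {order_id} accepted"            # the user never waits on the email

def worker():
    """Behind-the-scenes processor draining the queue asynchronously."""
    while True:
        job = work_queue.get()
        if job is None:                            # shutdown sentinel
            break
        time.sleep(0.1)                            # pretend this is the slow bit
        print(f"processed {job['action']} for order {job['order_id']}")
        work_queue.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
print(handle_request(1001))                        # returns immediately
work_queue.join()                                  # only so the demo output shows
work_queue.put(None)
```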

Scalability Problems

It often boils down to “don’t be an idiot.”  His words not mine.  I like this guy. Performance is easier than scaling.  Extremely high perf systems tend to be easier to scale because they don’t have to scale as much.

e.g. An email marketing campaign with a URL not ending in a trailing slash.  Guess what, you just doubled your hits.  Use the damn trailing slash to avoid 302s.

How do you stop everyone from being an idiot though?  Every person who sends a mass email from your company?  That’s our problem  – with more than fifty programmers and business people generating apps and content for our Web site, there is always a weakest link.

Caching should be controlled, not prevented, in nearly any circumstance.

Understand the problem.  Going from 100k to 10MM users – don’t just bucketize in small chunks and assume it will scale.  Allow for margin for error.  Designing for 100x or 1000x requires a profound understanding of the problem.

Example – I plan for a traffic spike of 3000 new visitors/sec.  My page is about 300k.  CPU bound.  8ms service time.  Calculate servers needed.  If I Varnish the static assets, the calculation says I need 3-4 machines.  But do the math and it’s on the order of 8 Gbps of throughput.  No way.  At 1.5MM packets/sec – the firewall dies.  You have to keep the whole system in mind.

So spread out static resources across multiple datacenters, agg’d pipes.
The rest is only 350 Mbps, 75k packets per second, doable – except the 302 adds 50% overage in packets per sec.
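A quick back-of-envelope check of that spike example (my own numbers, assuming ~300 KB pages and ~1500-byte data packets; the talk’s exact figures may differ a bit, and this ignores requests, ACKs, and the 302 overage):

```python
# Rough capacity math for the 3000 visitors/sec spike example.
visitors_per_sec = 3000
page_bytes = 300 * 1024                                        # ~300 KB per page view

throughput_gbps = visitors_per_sec * page_bytes * 8 / 1e9
data_packets_per_sec = visitors_per_sec * page_bytes / 1500    # MTU-sized packets

print(f"{throughput_gbps:.1f} Gbps")                 # ~7.4 Gbps of page data alone
print(f"{data_packets_per_sec / 1e6:.2f} Mpps")      # ~0.6 Mpps before ACKs/requests/302s
```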

Last bonus thought – use ZFS/DTrace for dbs, so run them on Solaris!

Velocity 2009 – Hadoop Operations: Managing Big Data Clusters

Hadoop Operations: Managing Big Data Clusters (see link on that page for the preso) was given by Jeff Hammerbacher of Cloudera.

Other good references –

  • Book: “Hadoop: The Definitive Guide”
  • Preso: Hadoop cluster management, from USENIX 2009

Hadoop is an Apache project inspired by Google’s infrastructure; it’s software for programming warehouse-scale computers.

It has recently been split into three main subprojects – HDFS, MapReduce, and Hadoop Common – and sports an ecosystem of various smaller subprojects (hive, etc.).

Usually a hadoop cluster is a mess of stock 1 RU servers with 4x1TB SATA disks in them.  “I like my servers like I like my women – cheap and dirty,” Jeff did not say.

HDFS:

  • Pools servers into a single hierarchical namespace
  • It’s designed for large files, written once/read many times
  • It does checksumming, replication, compression
  • Access is from Java, C, command line, etc.  Not usually mounted at the OS level.

MapReduce:

  • Is a fault tolerant data layer and API for parallel data processing
  • Has a key/value pair model
  • Access is via Java, C++, streaming (for scripts), SQL (Hive), etc
  • Pushes work out to the data
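To illustrate the key/value pair model above, here’s a tiny local word-count sketch.  This is my own illustration – a real job would run over HDFS via the Java API or Hadoop Streaming, with the framework doing the shuffle and pushing work to the data:

```python
# Minimal local simulation of the MapReduce key/value model (word count).
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs - here, (word, 1) for each word in a record."""
    for word in record.split():
        yield (word.lower(), 1)

def reduce_phase(key, values):
    """Combine all values seen for one key."""
    return (key, sum(values))

def run_job(records):
    # Shuffle: group intermediate values by key (the framework does this for you).
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            grouped[key].append(value)
    return [reduce_phase(k, v) for k, v in grouped.items()]

print(run_job(["the quick brown fox", "the lazy dog", "the fox"]))
# [('the', 3), ('quick', 1), ('brown', 1), ('fox', 2), ('lazy', 1), ('dog', 1)]
```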

Subprojects:

  • Avro (serialization)
  • HBase (like Google BigTable)
  • Hive (SQL interface)
  • Pig (language for dataflow programming)
  • ZooKeeper (coordination for distributed systems)

Facebook used Scribe (a log aggregation tool) to pull a big wad of info into Hadoop, then published it out to MySQL for the user dashboard and to Oracle RAC for internal use…
Yahoo! uses it too.

Sample projects Hadoop would be good for – log/message warehouse, database archival store, search team projects (autocomplete), targeted web crawls…
For boxes you can use unused desktops, retired db servers, Amazon EC2…

Tools they use to make hadoop include subversion/jira/ant/ivy/junit/hudson/javadoc/forrest
It uses an Apache 2.0 license

Good configs for hadoop:

  • use 7200 rpm sata, ecc ram, 1U servers
  • use linux, ext3 or maybe xfs filesystem, with noatime
  • JBOD disk config, no raid
  • java6_14+

To manage it –

unix utes: sar, iostat, iftop, vmstat, nfsstat, strace, dmesg, friends

java utes: jps, jstack, jconsole
Get the rpm!  www.cloudera.com/hadoop

config: my.cloudera.com
modes – standalone, pseudo-distributed, distributed
“It’s nice to use dsh, cfengine/puppet/bcfg2/chef for config management across a cluster; maybe use scribe for centralized logging”

I love hearing what tools people are using, that’s mainly how I find out about new ones!

Common hadoop problems:

  • “It’s almost always DNS” – use hostnames
  • open ports
  • distrib ssh keys (expect)
  • write permissions
  • make sure you’re using all the disks
  • don’t share NFS mounts for large clusters
  • set JAVA_HOME to new jvm (stick to sun’s)

HDFS In Depth

1.  NameNode (master)
VERSION file shows data structs, filesystem image (in memory) and edit log (persisted) – if they change, painful upgrade

2.  Secondary NameNode (aka checkpoint node) – checkpoints the FS image and then truncates edit log, usually run on a sep node
New backup node in .21 removes need for NFS mount write for HA

3.  DataNode (workers)
stores data in local fs
stores data in blk_<id> files, round-robins through dirs
heartbeat to namenode
raw socket to serve to client

4.  Client (Java HDFS lib)
other stuff (libhdfs) more unstable

HDFS operator utilities

  • safe mode – when it starts up
  • fsck – hadoop version
  • dfsadmin
  • block scanner – runs every 3 wks, has web interface
  • balancer – examines ratio of used to total capacity across the cluster
  • har (like tar) archive – bunch up smaller files
  • distcp – parallel copy utility (uses mapreduce) for big loads
  • quotas

has users, groups, permissions – including x, but there is no execution; it’s used for dirs
hadoop has some access trust issues – used through gateway cluster or in trusted env
audit logs – turn on in log4j.properties

has loads of Web UIs – on namenode go to /metrics, /logLevel, /stacks
non-hdfs access – HDFS proxy to http, or thriftfs
has trash (.Trash in home dir) – turn it on

includes benchmarks – testdfsio, nnbench

Common HDFS problems

  • disk capacity, esp due to log file sizes – crank up reserved space
  • slow-but-not-dead disks and NICs flapping into slow mode
  • checkpointing and backing up metadata – monitor that it happens hourly
  • losing write pipeline for long lived writes – redo every hour is recommended
  • upgrades
  • many small files

MapReduce

use Fair Share or Capacity scheduler
distributed cache
jobcontrol for ordering

Monitoring – They use ganglia, jconsole, nagios and canary jobs for functionality

Question – how much admin resource would you need for hadoop?  Answer – Facebook ops team had 20% of 2 guys hadooping, estimate you can use 1 person/100 nodes

He also notes that this preso and maybe more are on slideshare under “jhammerb.”

I thought this presentation was very complete and bad ass, and I may have some use cases that hadoop would be good for coming up!