How to develop software faster and have more stable releases?
Self introduction
- Gábor Szabó @szabgab
- Help organizations generate value faster in a sustainable way.
- Code Mavens Meetup
Goal of the company / organization
- More value to customer
- More money to the shareholder
Goals of employees
- Autonomy.
- Mastery.
- Purpose.
- Location (close to home).
- Good salary.
- Job stability.
- Interesting, professional challenge.
- Add value - make a contribution to something valuable.
- Learn new things that will also be valuable at another company.
- Enjoy their time at work (no late hours, no tension).
- Working with nice people.
- Having a good manager.
- Being in the loop.
- Being respected, acknowledged and recognized.
- Working without wasting effort (being more effective).
- Good hardware, software and office environment.
Goals - Contradiction?
- Satisfied engineers create more value.
Business needs for change
- Reduce time to market
- Increase feature throughput
- Decrease cost
- Increase quality
Value creation - Time is Money
- The sooner the better.
- Getting USD 1,000,000 ten years from now is great.
- Getting it one year from now is much better.
- Getting it next week is even better than that.
- NPV: Net Present Value.
So "sooner" has a higher value than later.
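The arithmetic behind this slide can be sketched as a simple discounting calculation. This is a minimal illustration, not a full NPV model: the 10% annual discount rate is an assumption chosen for the example, and `present_value` is a hypothetical helper name.

```python
def present_value(amount: float, annual_rate: float, years: float) -> float:
    """Discount a single future payment back to what it is worth today."""
    return amount / (1 + annual_rate) ** years


AMOUNT = 1_000_000   # the USD 1,000,000 from the slide
ANNUAL_RATE = 0.10   # assumed discount rate, for illustration only

for label, years in [("ten years", 10), ("one year", 1), ("one week", 1 / 52)]:
    value_today = present_value(AMOUNT, ANNUAL_RATE, years)
    print(f"USD {AMOUNT:,} in {label} is worth about USD {value_today:,.0f} today")
```

Whatever positive rate is assumed, the ordering stays the same: the earlier the payment arrives, the closer its present value is to the full amount.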
How to be Faster, Cheaper, Better?
- Lean
- Agile
- Scrum
- Kanban
- FAST
- SAFe
- Spine
- XP
- DevOps
- DevSecOps
Value creation
Product types
- How frequently are they upgradable?
- Hardware
- Embedded software
- On-premise application/device
- Desktop Application
- Mobile Application
- Web Application
Fast or stable?
- Need long QA cycle to have stable product.
- Development vs. Operations.
Release frequency
Deploys per day vs. value (Value creation)
- More value sooner
- Faster feedback
- Can the clients actually absorb the changes?
- Can we deliver the frequent changes?
- How can we ensure the quality remains high or even increases?
MTTR - Mean time to repair
- The more frequently we can release, the sooner we can fix issues.
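One way to track this metric is to average the time between detecting a problem and having the fix live. The sketch below assumes incidents are recorded as `(detected, fixed)` timestamp pairs. Both the data and the function name `mean_time_to_repair` are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average of (fixed - detected) over a list of (detected, fixed) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((fixed - detected for detected, fixed in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents: (when the problem was detected, when the fix was live).
incidents = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),     # repaired in 1 hour
    (datetime(2024, 1, 15, 14, 0), datetime(2024, 1, 15, 17, 0)),  # repaired in 3 hours
]

print(mean_time_to_repair(incidents))  # prints 2:00:00
```

If releases only go out once a month, the "fixed" timestamp cannot be earlier than the next release, no matter how quickly the code is corrected.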
Old model
- Waterfall with Big Bang release
- Requirements (clients)
- Design
- Development: many months
- QA: several months, bugs, rework, etc.
- Operations
- Information Security
Wall of Confusion
by Andrew Clay Shafer.
The business cost
- Wasted time and the cost of fixing bugs.
- Loss of customer trust due to bugs.
- Long development time.
- Fear of release.
The human cost
- Long working hours.
- Reduced quality of life.
- Feeling powerless in the organization.
- Low employee satisfaction.
- High turnover rate.
High Performing organizations
- Multiple deploys per day vs. one per month
- Commit to deploy in less than 1 hour vs. one week
- Recover from failure in less than 1 hour vs. one day
- Change failure rate of 0-15% vs. 31-45%
- Source: Puppet Labs report
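To compare a team against these numbers, the change failure rate can be computed from a deployment log. This is a minimal sketch that assumes each deployment is recorded as a boolean (True if it went fine, False if it needed a rollback or hotfix). The data below is made up for the example.

```python
def change_failure_rate(deploy_outcomes):
    """Fraction of deployments that caused a failure in production."""
    if not deploy_outcomes:
        raise ValueError("need at least one deployment")
    failures = sum(1 for succeeded in deploy_outcomes if not succeeded)
    return failures / len(deploy_outcomes)


# Hypothetical log: 17 good deployments and 3 that needed a rollback or hotfix.
outcomes = [True] * 17 + [False] * 3
rate = change_failure_rate(outcomes)
print(f"change failure rate: {rate:.0%}")  # prints "change failure rate: 15%"
```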
High Performing organizations
- 2.5x more likely to exceed business goals:
- Profitability
- Market share
- Productivity
- Source: Puppet Labs report
Getting faster
Release once a year ==============> Amazon speed (more than 1 per second)
- Priorities
- Small Batch size
- Reduce Multitasking
- Architecture
- Automated Tests
- Refactoring
- Build only what you need
- Create fast feedback loop
Priorities
- Instead of building 5 features in parallel (one feature per person):
- Build 2-3 features first and, when they are done, build the remaining features.
1 =============>
1 =============>
1 =============>
1 =============>
1 =============>
2 ======>
2 ======>
1 =============>
2 ======>
2 ======>
- You get some value (and feedback) earlier.
- Incremental delivery.
Small Batch size
Example: filling envelopes. You have 10 envelopes to fill with a letter, and there are 4 steps:
- Fold the letter.
- Put the letter in the envelope.
- Write the address on the envelope.
- Seal the envelope.
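The envelope example can be simulated to show when each envelope is finished under the two approaches. This is a rough sketch that assumes every step takes one time unit per envelope and that one person does all the work. The function names are made up for the illustration.

```python
STEPS = ["fold", "insert", "address", "seal"]


def completion_times_batch(n_envelopes, step_time=1):
    """Large batch: finish a step for every envelope before starting the next step."""
    clock = 0
    done = []
    for step_index in range(len(STEPS)):
        for _ in range(n_envelopes):
            clock += step_time
            if step_index == len(STEPS) - 1:
                done.append(clock)  # an envelope is finished only after sealing
    return done


def completion_times_single_piece(n_envelopes, step_time=1):
    """Batch size of one: do all four steps on an envelope before taking the next."""
    clock = 0
    done = []
    for _ in range(n_envelopes):
        clock += step_time * len(STEPS)
        done.append(clock)
    return done


print("batch:       ", completion_times_batch(10))
print("single piece:", completion_times_single_piece(10))
```

Under these assumptions both approaches finish the last envelope at time 40, but the first finished envelope appears at time 4 with a batch size of one and only at time 31 with the large batch. A problem such as a letter that does not fit the envelope would also be found much earlier.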
Reduce Multitasking
Multitasking Exercise
Exercise: Write down the following 3 sets of values and measure how long it takes:
0 1 2 3 4 5 6 7 8 9
a b c d e f g h i j
I II III IV V VI VII VIII IX X
The first time, write them row by row:
- First write down the Arabic numerals.
- Then the Latin letters.
- Then the Roman numerals.
In the next round write down the same values, but this time start with the first value of each set, then the second value of each set, and so on. (So you'd first write down 0, a, I, then 1, b, II, etc.)
Observe how much longer the second method takes.
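One way to see why the second round is slower is to count how often you have to switch from one kind of symbol to another. The sketch below only counts those switches as a rough proxy. It does not measure how long a person actually takes, and the function names are made up for the illustration.

```python
ROWS = {
    "arabic": "0 1 2 3 4 5 6 7 8 9".split(),
    "latin": "a b c d e f g h i j".split(),
    "roman": "I II III IV V VI VII VIII IX X".split(),
}


def row_by_row():
    """First round: finish one whole row before starting the next."""
    return [(kind, value) for kind, values in ROWS.items() for value in values]


def interleaved():
    """Second round: first value of each row, then second value of each row, ..."""
    return [(kind, values[i]) for i in range(10) for kind, values in ROWS.items()]


def count_switches(order):
    """Number of times two consecutive items belong to different rows."""
    return sum(1 for (a, _), (b, _) in zip(order, order[1:]) if a != b)


print("row by row: ", count_switches(row_by_row()), "switches")   # 2 switches
print("interleaved:", count_switches(interleaved()), "switches")  # 29 switches
```

The same 30 values are written in both rounds. Only the number of context switches changes.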
Build only what you need
- When asked to add a feature, first try to figure out Why? What is the problem that needs to be solved?
- If possible use an existing tool or service. (Open Source, Cloud)
- Focus on building what you really need.
Create fast feedback loops
- Learning from mistakes made half a year earlier is costly, painful, and never really happens.
- Learning from mistakes made 10 min ago is much easier and more valuable.
Feedback Techniques
- From the client after a year of development.
- Telemetry: servers, client interaction, errors, failures... Log and monitor everything.
- Continuous Deployment (CD).
- Code reviews.
- Continuous Integration (CI).
- Build system.
- Test automation (unit and other automated tests).
- Pair programming.
- Mob programming.
Learn from the mistakes
- Blameless post mortem. Etsy Morgue tool.
- Learning organization.
- Transform local discoveries into global improvements. (US Navy reactors.)
Retrospectives
- After each sprint.
- Once every few weeks.
- Primarily about the process.
Daily feedback meetings
- What did you finish yesterday?
- What will you finish today?
- What's blocking you? (What's your red flag?)
Continuous Integration (CI)
- Nightly build?
- Make sure the code is always releasable/deployable.
- Standardized environments. (Development, testing)
Toyota Andon cord
- Swarm and solve problems, build and spread new knowledge.
Test-Driven Development
- Tests provide a solid ground.
- More confidence in our changes.
- Tests make better code.
- Tests make better systems (catch bugs earlier).
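A small illustration of the test-first rhythm: the tests below would be written before the function, fail at first, and then pass once the function is implemented. The function `is_leap_year` and the chosen cases are only an example, not something from the talk.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# In TDD these tests come first and drive the implementation above.
def test_year_divisible_by_4_is_leap():
    assert is_leap_year(2024)


def test_ordinary_year_is_not_leap():
    assert not is_leap_year(2023)


def test_century_is_not_leap():
    assert not is_leap_year(1900)


def test_year_divisible_by_400_is_leap():
    assert is_leap_year(2000)


if __name__ == "__main__":
    for test in (
        test_year_divisible_by_4_is_leap,
        test_ordinary_year_is_not_leap,
        test_century_is_not_leap,
        test_year_divisible_by_400_is_leap,
    ):
        test()
    print("all tests passed")
```

With tests like these in place, a later refactoring of `is_leap_year` can be checked in seconds, which is where the extra confidence comes from.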
Optimizing Developer Effort
Microsoft research shows that developers on a mature code-base spend their time:
- 75% reading code
- 20% modifying code
- 5% writing new code
Pair Programming
- 2 people at the same computer
- Typing time?
Refactoring
- Clean up the mess!
Architecture
- A monolith is good for startups
- SOA - Service Oriented Architecture
- Conway's law
Conway's Law
- Mel Conway 1967
- Organization determines architecture.
- Modular system requires modular organization.
Small Teams
- 2 Pizza team (Jeff Bezos) (Full-service)
- Align to Business Domains
End-to-end Ownership
- You build it, you run it. Werner Vogels (CTO of Amazon)
The same team
- writes code
- checks quality
- runs the service
- If you are the one who needs to wake up at night for a bug, you will fix it soon.
Project boundaries
- The majority of the work should be inside of each team.
Design
- Design for both external and internal customers.
- The external pays for it but the internal also uses it.
- Optimize for downstream work centers.
Features
- Testability
- Deployability
- Architecture
- Security
- Performance
- Stability
- Configurability
Continuous Deployment (CD)
- Repeatable deployment pipeline.
Decouple deployment
- Decouple deploy from release.
- Decouple delivery from deploy.
- Feature flags.
- Dark launches
- Deployment circles - Canary release - Cluster immune systems
- Blue-green deployment
- A/B testing
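A feature flag is one of the simplest ways to decouple deploy from release: the new code is deployed but stays switched off until the flag is turned on. This is a minimal sketch using a plain dictionary. The flag name `new_checkout` and the discount logic are hypothetical, and a real system would usually read flags from configuration or a flag service.

```python
# Hypothetical flag store; the new code path is deployed but switched off.
FLAGS = {"new_checkout": False}


def is_enabled(name, flags=FLAGS):
    """Unknown flags count as disabled, so a typo never enables a feature."""
    return flags.get(name, False)


def checkout_total(prices, flags=FLAGS):
    if is_enabled("new_checkout", flags):
        # New behaviour (illustrative): apply a 10% discount.
        return round(sum(prices) * 0.9, 2)
    # Old behaviour stays the default until the flag is turned on.
    return sum(prices)


print(checkout_total([10, 20]))                          # old path: 30
print(checkout_total([10, 20], {"new_checkout": True}))  # new path: 27.0
```

Releasing the feature is then a configuration change rather than a new deployment, and turning the flag off again is the rollback.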
Blue-green deployment
- Decouple changes to the database and changes to the application.
- Duplicate the whole stack.
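The idea of duplicating the stack can be sketched as a router that points at one of two identical environments. This is a toy in-memory model with hypothetical names. It does not cover the database side mentioned above, which needs its own migration strategy.

```python
class BlueGreenRouter:
    """Two identical stacks; only one of them receives live traffic."""

    def __init__(self, version):
        self.stacks = {"blue": version, "green": version}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        """Install the new version on the stack that gets no traffic."""
        self.stacks[self.idle] = version

    def switch(self):
        """Point traffic at the other stack; calling it again rolls back."""
        self.live = self.idle

    def serving(self):
        return self.stacks[self.live]


router = BlueGreenRouter("v1")
router.deploy_to_idle("v2")
print(router.serving())  # still v1: users are not affected by the deployment
router.switch()
print(router.serving())  # now v2
```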
Canary release
- Deploy to only a few servers, then monitor.
- Enable for only a subset of users, then monitor.
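Enabling a change for a subset of users is often done by hashing the user ID into a bucket, so the same user always gets the same answer. This is a minimal sketch of that idea. The function name `in_canary` and the user IDs are made up for the example.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets and enable
    the new version for the first `percent` buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


users = [f"user-{n}" for n in range(10_000)]  # hypothetical user IDs
enabled = sum(in_canary(user, 10) for user in users)
print(f"{enabled} of {len(users)} users see the canary release")
```

Because the bucket depends only on the user ID, raising the percentage from 10 to 50 keeps every user who was already in the canary group and adds new ones. Monitoring and the decision to continue or roll back are outside this sketch.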
Infrastructure as code
- Requirements files.
- Vagrant configurations.
- Ansible/Chef/Puppet
- Containers - Docker Images
- Container Orchestration - Kubernetes
DevOps loop
- Requirements
- Design / Plan
- Development / Code
- Build
- InfoSec
- QA
- Release
- Deploy
- Operations
- Monitoring
Continuous Improvement
- Continuous Improvement.
- Continuous Learning.
This needs an investment of both time and money, and it leads to change.
Hierarchy of abstractions
- VPS (GCE - Google Compute Engine)
- Kubernetes (GKE - Google Kubernetes Engine)
- PaaS (GAE - Google App Engine)
- Serverless (GCF - Google Cloud Functions)
Resilience testing
- Intentionally cause problems during the work day and see how the tools and the team react.
- Randomly kill processes and compute servers in production to see how the monitoring system and the whole team react.
- Do this often during work hours to reduce the risk of such things happening at night.
- Fix any issues. Learn.
- Netflix Chaos Monkey
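The core loop of such an experiment can be sketched with a simulation: pick a random victim, remove it, and check whether the service is still healthy. This toy example only manipulates an in-memory set of hypothetical replica names. It kills nothing real and is not the Netflix Chaos Monkey API.

```python
import random


def kill_random_replica(replicas, rng):
    """Remove one randomly chosen replica from the simulated cluster."""
    victim = rng.choice(sorted(replicas))
    replicas.remove(victim)
    return victim


def service_is_healthy(replicas, minimum=2):
    """Assumed health rule: at least `minimum` replicas must still be running."""
    return len(replicas) >= minimum


replicas = {"web-1", "web-2", "web-3"}  # hypothetical replica names
rng = random.Random(42)                 # seeded so the experiment is repeatable

victim = kill_random_replica(replicas, rng)
print(f"killed {victim}; service healthy: {service_is_healthy(replicas)}")
```

In a real experiment the interesting part is what happens next: whether monitoring noticed, whether a replacement started automatically, and what the team learned.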
What is in it for me, the developer?
Most engineers I know want to enjoy work and be proud of their accomplishments.
- Safer workplace - no fear of change, no fear of release.
- Better working environment.
- Far fewer bugs and much less rework of the same code.
- Allows you to do other work when this is done.
- Learn new things.
- Learn better development practices that will be relevant in your next job as well.
The transformation process
- It can take years.
- We would like to see results soon.
- Top down support.
- Bottom up experimentation, feedback.
- Get support from the top: VP RnD, CxO.
- Work on team level.
Theory X and Theory Y
by Douglas McGregor
- X thinks people are lazy, need supervision.
- Y thinks people can be autonomous if trusted.
Time boxing experiment
- Use time boxing for experimentation to reduce risk and increase chances of acceptance.
Greenfield projects vs. brownfield projects
Greenfield projects are easier to get started with.
Brownfield projects have:
- Technical debt.
- Legacy code.
- Unsupported platforms.
- Users.
Getting Started
The two most important aspects are:
- Value Creation
- Feedback
Start with these.
- Discuss the things you value.
- Then build in short feedback loops in your process.
Top down approach
- Pick the value stream to be the first to convert:
- Start with the team that has the most open attitude to the new way of work.
- Educate people. Both about the ideas and about the techniques.
- Find the first bottleneck that is in your power to change.
- Reinforce learning culture.
- Instead of adding more people to the team, improve the way the team works.
Team level approach
- Start writing tests.
- For every new feature, for every bug.
- Include time for refactoring.
- Set up Continuous Integration
- Work in a standardized environment (Requirements, VMs, Containers)
At first this will take you extra time. Later you will see the value. Put these activities in your estimates. They are part of your job.
- Allocate at least 20% of your time to this.
Resources
Surveys
- State of Agile survey by CollabNet and VersionOne.
- State of DevOps report by Puppet Labs.
Books
- The Phoenix Project by Gene Kim, Kevin Behr, George Spafford
- The DevOps Handbook by Gene Kim, Patrick Debois, John Willis, Jez Humble, John Allspaw
- Continuous Delivery by Jez Humble and Dave Farley
- Lean Software Development: An Agile Toolkit by Mary Poppendieck and Tom Poppendieck.
- The Goal: A Process of Ongoing Improvement by Eliyahu M. Goldratt and Jeff Cox.
- Lean Thinking: Banish Waste and Create Wealth in Your Corporation by James P. Womack and Daniel T. Jones.
- The Fifth Discipline: The Art & Practice of The Learning Organization by Peter M. Senge.
- Beyond Legacy Code by David Scott Bernstein.
- Switch: How to Change Things When Change Is Hard by Chip Heath and Dan Heath
- Drive: The Surprising Truth About What Motivates Us by Daniel H. Pink
- Flow: The Psychology of Optimal Experience by Mihaly Csikszentmihalyi.
- Mob Programming by Woody Zuill and Kevin Meadows.
- Forging Python by Miki Tebeka.
Videos
- Velocity and volume (or speed wins) Adrian Cockcroft.
- Moving Fast At Scale video by Randy Shoup.
- Moving Fast At Scale slides by Randy Shoup.
Blog posts
- No, You are not Dumb! Programmers do spend a lot of time Understanding Code... links to surveys.
- Developers Should Abandon Agile by Ron Jeffries.
- Making Work Visible: Exposing Time Theft to Optimize Work & Flow by Dominica DeGrandis
Contact me
- Gábor Szabó @szabgab
Sources
http://www.shmula.com/about-peter-abilla/what-is-andon-in-the-toyota-production-system/
http://www.shmula.com/andon-system-what-does-it-tell-you/20584/