Software Engineer Monitoring
04/19/2023 ● Cleveland, Ohio ● Contract
Job Title: Software Engineering and Monitoring
Education: Computer Science is preferred
Specific Skill Set: Software engineering and monitoring skill set, with monitoring tools such as Dynatrace and/or OpenTelemetry or other open-source toolsets.
Traceable, event-based Observability enables exploratory investigation when issues occur, for causes both known and unknown. It helps teams troubleshoot issues without first having to predict what problems may happen or how, especially with complex, multi-layer distributed applications connected through microservices.
Observability also helps teams improve their understanding of how customers use our digital products. Product teams use that awareness to influence future development.
The Observability role contributes to the overall strategic vision of the organization for Observability capabilities, processes, patterns, and tooling. This role will lead efforts, working closely with the Product and Infrastructure teams, to ensure that all telemetry from applications, business events, appliances, and infrastructure is accurately received, tagged, and reported. The role involves leading efforts to maintain the Observability platform and ensuring it is optimized and operating within SLAs and SLOs. The role also acts as SME in Observability practices for the enterprise, providing services and solutions across the enterprise that enable businesses to achieve and sustain a higher SLA by improving software quality, reducing problem determination time and downtime, and enhancing the overall end-user experience.
Strategy & Planning
Socializing the Observability capabilities, processes, and technology with the various application groups.
Working with various product and business groups to help determine SLIs, SLOs and SLAs for products, applications, and services offered to the customer. Establishing strategies, processes, and tooling to adhere to the SLAs.
Lead efforts to provide self-service capabilities to analyze and visualize Observability data, providing end-to-end visibility into product and application performance (including dashboards, alerting, automated incident response capabilities, etc.).
Providing a strategic roadmap for Observability maturity, including recommendations on tooling and capabilities to support the ever-growing enterprise needs and new products.
Create, support, and sustain methods and procedures to measure outcomes of Observability practices.
Enable developers to use tools to identify symptoms and diagnose application issues by providing them the requisite access levels and training.
Develop and document Observability standards, procedures, and best practices for using the tool, and provide education in the tool's use.
Clearly communicate to IT and business stakeholders regarding performance-related recommendations and tradeoffs.
Partner with the QA team, assisting with creating and refining effective performance test objectives, test plans, and scenarios that help the organization achieve quality requirements for applications.
Acquisition & Deployment
Work with the business to provide guidance for developing KPIs in support of strategic business initiatives.
Establish measurements for KPIs and related business transactions of interest, and develop the executive dashboards required to observe application behavior, user behavior, and user interaction for business-critical functions.
Work with development and architecture teams to manage Observability data collection, analysis, and visualization for critical applications through the lifecycle of the application.
Work on continuous improvement of Observability capabilities, provide technical guidance to development teams, and aid in triaging production problems.
Independently utilize Observability tools to detect, isolate, and resolve issues affecting the user experience and user interaction with the applications.
Assist in major application and/or security incident troubleshooting.
Contribute to aspects of the solution delivery lifecycle in prototyping, capacity modeling, performance driven design, profiling, performance testing, availability management, and troubleshooting.
Guide operations and support team on building and refining application behavior data capture and reporting for Production systems, and corresponding processes.
Provide and design cross-team training opportunities.
Improve knowledge and skills in the Enterprise DevOps team so that its members become more competent and able to accept greater responsibilities.
Install and configure software products. Ensure compatibility between target product, operating system, and other resident software. Apply maintenance according to best practices.
Lead in capacity planning and performance management activities.
Contribute to the development of service level goals and objectives.
Develop and prepare metrics that measure services rendered.
Identify opportunities to improve service levels and/or minimize support efforts.
Perform standard configuration, management, and maintenance tasks in support of web resources.
Mentor and/or provide guidance to all members of the team.
Participate in disaster planning/mitigation/recovery.
Conduct product Proofs of Concept.
Assist with other projects as may be required to contribute to the efficiency and effectiveness of the group and other business/technical entities.
Assist and participate with Change Management preparations and implementations, providing technical subject matter expertise.
Attend, and periodically lead, meetings with the team.
Participate in hiring activities, fulfill affirmative action obligations, and ensure compliance with the equal employment opportunity policy.
Provide periodic 24/7 on-call support of specific functions.
10+ Years Experience
5+ Years of Software Engineering experience
5+ Years of experience in Observability / Monitoring software
Exposure to SRE