Manage and lead the design, implementation, and maintenance of AI infrastructure systems to ensure reliable operation of VNAP's AI prediction services and training environments.
1. Collaborate with the IT/CIM infrastructure teams that host VNAP's CPU/GPU application servers, database services, and middleware (such as VM/K8s, Kafka, MongoDB, and Oracle) to ensure high availability and reliability through well-established monitoring metrics and alarms.
2. Design and implement auto-recovery mechanisms using infrastructure-as-code tools such as Ansible and Terraform to minimize the tool idle and lot hold impacts caused by system issues.
3. Develop and maintain applications using C#/Delphi/Python on top of those infrastructure systems.
4. Work location: Hsinchu or Taoyuan
5. Hiring Organization: IMC
1. Master's degree in Computer Science, Information Technology, or related field.
2. Minimum 3 years of experience in infrastructure and system administration/operations.
3. Strong understanding of and hands-on experience with message queuing systems and SQL/NoSQL databases, such as Kafka, MongoDB, Oracle, and MariaDB.
4. Experience administering operating systems such as Windows servers and Linux distributions.
5. Strong experience in networking technologies including firewalls, nginx load balancing, and virtual IP setup.
6. Experience with operational monitoring systems such as Zabbix, Prometheus, and Grafana.
7. Experience with infrastructure-as-code tools such as Ansible and Terraform.
8. Experience in application development using C#/Delphi/Python on top of AI infrastructure system components for auto-recovery.
9. Excellent communication and interpersonal skills for cross-division and cross-department cooperation.
Diversity, Equity and Inclusion (DE&I) reflects TSMC’s core values and business philosophy and is essential for our future success. Our commitment to DE&I allows us to create an environment where every employee, regardless of gender, age, disability, religion, race, ethnicity, nationality, political affiliation, or sexual orientation, can bring their unique perspective and experiences to work, enabling us to drive profitability, increase productivity, and unleash innovation. To strive to create a workplace that is equitable and accessible to all employees, we also provide reasonable accommodations for qualified individuals with disabilities. We are committed to fostering an inclusive culture where every employee feels valued and empowered to contribute to our mission and provide excellent service to our global customers.