Korea University DMQA Lab

Human expertise-based AI vs. data-driven AI

February 12, 2025, 3:12 PM
Written by

Professor Seoung Bum Kim (김성범)

Artificial intelligence covers both rules built on human expertise and rules derived from data. The difference is who produces the rule that solves a given problem: a person or the data. Either way, once a rule exists, a computer executes it automatically. Building rules by hand takes a great deal of time, experience, and trial and error. Data-driven rules instead need data and an algorithm that turns the data into rules, and that algorithm is itself designed by people.

Consider rules for judging defects from semiconductor images. Hand-built rules of this kind come from engineers with long experience, after much trial and error. They are often accurate, but they can be somewhat subjective and inconsistent. They are also inflexible: when a new type of defect appears, the existing rules have difficulty detecting it. A data-driven rule, by contrast, is produced automatically once the algorithm is shown many semiconductor images along with labels saying which are normal and which are defective. When a new defect appears, updating such a rule is much easier and more flexible than revising hand-built rules. The cost is that the approach needs a large amount of data and accurate normal/defective labels.

Human-built rules have depth because they come out of countless rounds of trial and error, but they adapt poorly to new patterns. Data-driven rules are far more flexible, yet they struggle to detect patterns that were absent from the data used to build the model. In the end, human rules and data-driven rules will likely continue to coexist. As more high-quality data accumulates, however, the share of data-driven rules is expected to keep growing.
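The contrast between the two kinds of rule can be made concrete with a small sketch. Everything in it is an assumption made for illustration rather than something taken from the essay: the single feature `dark_fraction` (the share of dark pixels in an image), the hand-picked cutoff of 0.30, the toy labeled samples, and the simple "pick the cutoff with the fewest training errors" learner, which stands in for whatever algorithm a real system would use.

```python
from typing import Callable, List, Tuple

# Hypothetical feature: fraction of dark pixels in an image (0.0 to 1.0),
# paired with a label saying whether the image shows a defect.
Sample = Tuple[float, bool]  # (dark_fraction, is_defect)


def human_rule(dark_fraction: float) -> bool:
    """Hand-written rule: the 0.30 cutoff stands in for an engineer's experience."""
    return dark_fraction > 0.30


def learn_rule(samples: List[Sample]) -> Callable[[float], bool]:
    """Derive a cutoff from labeled data by minimizing training errors."""
    values = sorted({x for x, _ in samples})
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    # Candidate cutoffs: midpoints between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff: float) -> int:
        # Count samples whose predicted label disagrees with the given label.
        return sum((x > cutoff) != label for x, label in samples)

    best = min(candidates, key=errors)
    return lambda x: x > best


# Toy labeled data (made up): the 0.22 sample is a defect type that the
# fixed 0.30 cutoff was never designed to catch.
labeled = [(0.05, False), (0.10, False), (0.15, False),
           (0.22, True), (0.35, True), (0.50, True)]
learned_rule = learn_rule(labeled)

print(human_rule(0.22), learned_rule(0.22))  # prints: False True
```

In this sketch the hand-written rule stays fixed until someone edits it, while the learned rule shifts as soon as newly labeled examples are added and `learn_rule` is run again. The same sketch also shows the limitation noted above: a defect that leaves `dark_fraction` unchanged, or that never appears in `labeled`, would go undetected by the learned rule as well.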

Essay