HOME > 상세정보

상세정보

Learning heuristics for fast and precise Java points-to analysis

Learning heuristics for fast and precise Java points-to analysis

자료유형
학위논문
개인저자
정세훈, 鄭勢勳
서명 / 저자사항
Learning heuristics for fast and precise Java points-to analysis / Sehun Jeong
발행사항
Seoul :   Graduate School, Korea University,   2020  
형태사항
vi, 80장 : 도표 ; 26 cm
기타형태 저록
Learning Heuristics for Fast and Precise Java Points-to Analysis   (DCOLL211009)000000127338  
학위논문주기
학위논문(박사)-- 고려대학교 대학원: 컴퓨터·전파통신공학과, 2020. 2
학과코드
0510   6YD36   379  
일반주기
지도교수: 차성덕  
부록: A. Proof of theorem 4.1.1, B. Learned boolean formulas, C. 11 remaining features for heap abstraction heuristics 외  
서지주기
참고문헌: 장 74-80
이용가능한 다른형태자료
PDF 파일로도 이용가능;   Requires PDF file reader(application/pdf)  
비통제주제어
Points-to analysis , Java , Static analysis,,
000 00000nam c2200205 c 4500
001 000046026252
005 20200428142813
007 ta
008 200103s2020 ulkd bmAC 000c eng
040 ▼a 211009 ▼c 211009 ▼d 211009
085 0 ▼a 0510 ▼2 KDCP
090 ▼a 0510 ▼b 6YD36 ▼c 379
100 1 ▼a 정세훈, ▼g 鄭勢勳
245 1 0 ▼a Learning heuristics for fast and precise Java points-to analysis / ▼d Sehun Jeong
260 ▼a Seoul : ▼b Graduate School, Korea University, ▼c 2020
300 ▼a vi, 80장 : ▼b 도표 ; ▼c 26 cm
500 ▼a 지도교수: 차성덕
500 ▼a 부록: A. Proof of theorem 4.1.1, B. Learned boolean formulas, C. 11 remaining features for heap abstraction heuristics 외
502 1 ▼a 학위논문(박사)-- ▼b 고려대학교 대학원: ▼c 컴퓨터·전파통신공학과, ▼d 2020. 2
504 ▼a 참고문헌: 장 74-80
530 ▼a PDF 파일로도 이용가능; ▼c Requires PDF file reader(application/pdf)
653 ▼a Points-to analysis ▼a Java ▼a Static analysis
776 0 ▼t Learning Heuristics for Fast and Precise Java Points-to Analysis ▼w (DCOLL211009)000000127338
900 1 0 ▼a Jeong, Se-hun, ▼e
900 1 0 ▼a 차성덕, ▼g 車聖德, ▼e 지도교수
945 ▼a KLPA

전자정보

No. 원문명 서비스
1
Learning heuristics for fast and precise Java points-to analysis (84회 열람)
PDF 초록 목차

소장정보

No. 소장처 청구기호 등록번호 도서상태 반납예정일 예약 서비스
No. 1 소장처 과학도서관/학위논문서고/ 청구기호 0510 6YD36 379 등록번호 123063745 도서상태 대출가능 반납예정일 예약 서비스 B M
No. 2 소장처 과학도서관/학위논문서고/ 청구기호 0510 6YD36 379 등록번호 123063746 도서상태 대출가능 반납예정일 예약 서비스 B M

컨텐츠정보

초록

This dissertation demonstrates that data-driven techniques can significantly
outperform hand-crafted approaches by experts in generating effective
abstraction heuristics for Java pointer analysis. Heuristics for
context-sensitivity and heap abstraction are essential ingredients of
effective analysis. Existing heuristics generally have to balance trade-offs
between scalability and precision, whereas the proposed approach is highly
scalable while achieving acceptably low levels of reduction in precision: 1)
The process of generating heuristics, based on machine learning approaches,
is automated in that complex heuristics are derived as a combination of
atomic features; 2) Resulting heuristics, when applied to widely used DaCapo
benchmark programs, are highly efficient compared to the ones manually
crafted by human experts; and 3) Precision of the learned heuristics is also
superior to the hand-tuned rules when applied to context-sensitive analysis.
When applied to the heap abstraction problem, precision level is almost
equivalent in that additional alarms generated by data-driven heuristics
never exceeded by more than 4%.

Precise context-sensitive analysis is highly expensive, and the existing
state-of-the-art techniques made impressive contributions on scalability.
Unfortunately, their precisions are still unsatisfactory. In our approach, we
determine the proper level of context-sensitivity rigor best suited to each
method. Some methods are sufficient with highly efficient but imprecise
analysis. On the other hand, deeper context-sensitivity needs to be applied
on some methods to achieve a desired level of precision. In order to choose
the optimal strategy, we identified a set of 25 atomic features that are
useful as heuristic components. Examples include the followings: Does a
method return void data type? Does a method signature contain certain words?
A greedy learning algorithm iteratively performs pointer analysis on training
programs to identify the optimal combination that minimizes the cost while
not losing precision. Resulting heuristics are represented as a collection of
Boolean formula in disjunctive normal form. Empirical evaluation on DaCapo
benchmark suites convincingly demonstrates that data-driven approach
outperforms the results achieved by the state-of-the-art manual heuristic
called introspective analysis.

Likewise, we applied data-driven approach to heap abstraction problem for
which type-based and allocation-site-based abstraction techniques are known.
The former is efficient but imprecise to perform while the latter is precise
but expensive. Our goal is to apply the former technique by default and apply
the latter only when necessary. Decision criteria is derived by a supervised
learning algorithm in which a set of 16 atomic features are used as heuristic
components. Examples include the number of different typed heaps that a field
can point to or the number of generic methods. Empirical evaluation on DaCapo
benchmark suite convincingly demonstrates that data-driven approach
significantly outperforms the results of the state-of-the-art
type-consistency-based heap abstraction. For example, some benchmark programs
(e.g., bloat, chart, or eclipse) that could not be
completed in 5 hours can now be completed in less than 3 minutes.

목차

1 Introduction 1
  1.1 Challenges 2
  1.2 Contributions  2
  1.3 Organization 5
2 Related Work 7
  2.1 Context Sensitivity 7
  2.2 Heap Abstraction  10
3 Background 13
  3.1 Points-to Analysis in Datalog 13
  3.2 Parameterizing the Points-to Analysis  17
    3.2.1 Parametric context-sensitivity  17
    3.2.2 Parametric heap abstraction  19
4 Data-driven Context Sensitivity 21
  4.1 Approach  21
    4.1.1 Modeling of context-Sensitive points-to analysis  21
    4.1.2 Goal 23
    4.1.3 Modeling of context-selection heuristics  24
    4.1.4 The learning problem  25
    4.1.5 The learning algorithm  26
    4.1.6 Atomic features  33
  4.2 Experiments 35
    4.2.1 Effectiveness and generalization  36
    4.2.2 Adequacy of our learning approach  43
    4.2.3 Learned features  45
    4.2.4 Threats to validity  46
5 Data-driven Heap Abstraction 49
  5.1 Approach  49
    5.1.1 Parametrization of heap abstraction for points-to analysis 49
    5.1.2 Goal 51
    5.1.3 Modeling of our heap clustering heuristics  52
    5.1.4 Adequacy of the heap abstraction heuristic H R  52
    5.1.5 The learning procedure  54
    5.1.6 The five important features for heap abstraction 58
  5.2 Experiments  62
    5.2.1 Effectiveness and generalization  62
    5.2.2 Important features 63
    5.2.3 Overhead of our approach 65
6 Conclusions and Future Work 67
  6.1 Conclusions 67
  6.2 Future Work 68