(698e) Building Non-Parametric Models for RSM: A Bayesian Approach and Its Application to Plasma Etching | AIChE

Authors 

Tatavalli Mittadar, N. - Presenter, University of Houston
Economou, D. J. - Presenter, University of Houston


Abstract

The response surface methodology (RSM) is a collection of statistical design-of-experiments tools and numerical optimization techniques used to optimize processes and products (Myers, Montgomery et al. 2004).  RSM entails a succession of experimentation/optimization cycles, repeated until an optimal solution is determined or resources have been exhausted.  In each cycle, experiments are conducted at selected values of the decision variables, and the experimental results are used to construct an input-output model (response surface) that suggests the experimental conditions for the next cycle.  Parametric or nonparametric models can be used to approximate the relationship between the decision variables (inputs) and the responses (outputs) of interest.  A nonparametric class of models that has been successful for complicated response surfaces with multiple optima employs kriging approximators (Isaaks and Srivastava 1989), which have been used for design and analysis of computer experiments (DACE) (Santner, Williams et al. 2003).  DACE models have been used in conjunction with global optimization methodologies inspired by Kushner's criterion (Kushner 1964).
According to standard methodology, the structure of a DACE model is determined by a preselected trend function and by the experimental data collected, whereas the values of the DACE model parameters are determined using a maximum-likelihood estimation (MLE) method detailed in (Pardo-Igúzquiza 1998).  However, because the DACE model structure relies on interpolation, MLE often produces a DACE response surface that is void of useful information: the response surface consists of the preselected trend function along with narrow peaks and valleys in small neighborhoods around the points where experiments have been conducted.  Such a model is far from the actual response surface and cannot be used for optimization.  This phenomenon has been observed before (Sasena 2002) and (Lin 2004).  The remedy suggested in (Sasena 2002), conducting two experiments at points close to each other, works well for one decision variable but does not easily extend to the more interesting case of multiple decision variables.
In fact, the general suggestion in the literature when dealing with multiple decision variables is to start RSM with a large number of experiments, in order to capture the shape of the response surface. For example, (Schonlau, Welch et al. 1997) suggest starting with a number of experiments equal to 10 times the number of active variables. This practice may lead to inefficiencies or may limit applications to cases where real-time experiments are not costly.
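The DACE structure described above can be illustrated with a minimal sketch. It assumes a constant trend function and a Gaussian correlation with a single parameter `theta`; neither choice is stated in this abstract, and the function names `fit_kriging` and `predict` are hypothetical. When `theta` is large (a short correlation length), the predictor collapses to the trend everywhere except in narrow neighborhoods of the design sites, which is the peak-and-valley behavior discussed above.

```python
import numpy as np

def fit_kriging(X, y, theta, nugget=1e-10):
    """Fit a constant-trend kriging model for a fixed correlation parameter.

    Assumed (not from the abstract): Gaussian correlation
    R_ij = exp(-theta * ||x_i - x_j||^2) and a tiny nugget for stability.
    """
    n = len(y)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    R = np.exp(-theta * d2) + nugget * np.eye(n)
    ones = np.ones(n)
    Ri_1 = np.linalg.solve(R, ones)
    # Generalized least-squares estimate of the constant trend.
    beta = ones @ np.linalg.solve(R, y) / (ones @ Ri_1)
    resid = y - beta
    Ri_resid = np.linalg.solve(R, resid)
    # Closed-form MLE of the process variance for this theta.
    sigma2 = resid @ Ri_resid / n
    # Concentrated log-likelihood (additive constants dropped).
    _, logdet = np.linalg.slogdet(R)
    loglik = -0.5 * (n * np.log(sigma2) + logdet)
    return {"X": X, "theta": theta, "R": R, "beta": beta,
            "Ri_1": Ri_1, "Ri_resid": Ri_resid,
            "sigma2": sigma2, "loglik": loglik}

def predict(model, x):
    """Return the kriging mean and variance at a single point x."""
    X, theta = model["X"], model["theta"]
    r = np.exp(-theta * ((X - x) ** 2).sum(axis=1))
    mean = model["beta"] + r @ model["Ri_resid"]
    Ri_r = np.linalg.solve(model["R"], r)
    u = 1.0 - Ri_r.sum()
    var = model["sigma2"] * (1.0 - r @ Ri_r + u ** 2 / model["Ri_1"].sum())
    return float(mean), max(float(var), 0.0)

# Illustrative data only (not from the abstract).
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_demo = np.array([1.0, 3.0, 2.0, 5.0])
spiky = fit_kriging(X_demo, y_demo, theta=1000.0)
smooth = fit_kriging(X_demo, y_demo, theta=1.0)
```

Under these assumptions, `predict(spiky, np.array([0.5]))` returns essentially the trend value, while `predict(smooth, np.array([0.5]))` blends information from neighboring sites. A standard MLE would select `theta` by maximizing `loglik`.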

In this work, we present a refinement of the MLE strategy that produces DACE response surfaces better suited to designing experiments for optimization purposes.  The proposed refinement relies on the use of informative priors on the model parameters.
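One common way to combine a likelihood with an informative prior is to maximize the log-posterior rather than the log-likelihood alone. The abstract does not give the form of the prior or of the modified objective, so the expressions below are only an illustrative sketch: $\theta$ denotes the correlation parameters, $L(\theta \mid y)$ the likelihood of the observed responses $y$, and the log-normal prior with hyperparameters $\mu_0$ and $\tau$ is a hypothetical choice.

```latex
\hat{\theta} = \arg\max_{\theta} \left[ \ln L(\theta \mid y) + \ln p(\theta) \right],
\qquad
\ln p(\theta) = -\ln \theta - \frac{(\ln \theta - \mu_0)^2}{2\tau^2} + \text{const}
```

With such a prior, values of $\theta$ that correspond to implausibly short correlation lengths are penalized, which discourages response surfaces made up of isolated spikes around the design sites.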

As a motivating example, we revisit the example in (Sasena 2002). The DACE model based on the traditional MLE strategy is shown in Figure 1. Visual inspection shows that such an approximation is unacceptable. The model based on the new strategy is shown in Figure 2. Further suppose that this is a constraint function in a design problem and that we are interested in the region where its value is greater than 9.0. To identify this region, we consider the probability that the predicted response is greater than 9.0. The estimate of this probability using the model constructed by the traditional MLE strategy is shown in Figure 3. This estimate is clearly far from reality and may lead to serious inefficiencies in the optimization strategy. The new strategy shows improved performance, as can be seen in Figure 4.
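The exceedance probability used above can be sketched as follows, assuming the predicted response at a point is treated as Gaussian with the kriging mean and standard error. This is a common convention for kriging models, but the abstract does not state how the probability is computed. The function name `prob_exceeds` is hypothetical.

```python
import math

def prob_exceeds(mean, std, threshold):
    """P(Y > threshold) for Y ~ Normal(mean, std**2).

    Assumption: the predictive distribution is Gaussian; mean and std
    would come from the fitted response-surface model.
    """
    if std <= 0.0:
        # Degenerate case, e.g. exactly at a design site with no noise.
        return 1.0 if mean > threshold else 0.0
    z = (threshold - mean) / std
    # 1 - Phi(z), written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

For instance, `prob_exceeds(9.0, 1.0, 9.0)` evaluates to 0.5, because the threshold coincides with the predicted mean. A model whose standard error is poorly estimated away from the design sites will distort this probability map even where the mean looks plausible.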

We demonstrate the effectiveness of the proposed approach through a simulation study on a plasma etching reactor, where we want to identify process conditions (for example, pressure) and reactor configuration (for example, the location of the induction coils) that result in a uniform etch rate in the presence of non-trivial constraints, such as an etch rate that must be greater than a specified minimum. Specifically, we consider here a case of Argon plasma with a single induction coil located on the sides. The configuration is illustrated in Figure 5. The design problem is to identify the optimal location of the coil and the pressure of the Argon plasma. The constraint is that the mean flux of the Argon ions at the wafer surface must be greater than $2.5 \times 10^{20}$ m$^{-2}$. The response surface for the non-uniformity and the probability that the constraint is satisfied, based on DACE models constructed using the traditional MLE strategy, are shown in Figure 6 and Figure 7. The corresponding DACE models based on the modified MLE strategy are shown in Figure 8 and Figure 9.

Isaaks, E. H. and R. M. Srivastava (1989). An Introduction to Applied Geostatistics. New York, Oxford University Press.

Kushner, H. J. (1964). "A New Method of Locating the Maximum of an Arbitrary Multipeak Curve in the Presence of Noise." Journal of Basic Engineering 86: 97-106.

Lin, Y. (2004). An Efficient Robust Concept Exploration Method and Sequential Exploratory Experimental Design. Mechanical Engineering. Atlanta, Georgia Institute of Technology. Ph.D.: 780.

Myers, R. H., D. C. Montgomery, et al. (2004). "Response Surface Methodology: A Retrospective and Literature Survey." Journal of Quality Technology 36(1): 53-70.

Pardo-Igúzquiza, E. (1998). "Maximum Likelihood Estimation of Spatial Covariance Parameters." Mathematical Geology 30(1): 95-108.

Santner, T. J., B. J. Williams, et al. (2003). The Design and Analysis of Computer Experiments, Springer Series in Statistics.

Sasena, M. J. (2002). Flexibility and Efficiency Enhancements for Constrained Global Design Optimization with Kriging Approximations. Department of Mechanical Engineering, University of Michigan. PhD Thesis.

Schonlau, M., W. J. Welch, et al. (1997). A Data Analytic Approach to Bayesian Global Optimization. American Statistical Association Proceedings, Section of Physical Engineering Sciences: 186-191.



Figure 1: The traditional MLE methodology results in peak-and-valley behavior in the neighborhood of the design sites.


Figure 2: The new strategy, based on an informative prior, results in a superior approximation.




Figure 3: The probability that the response is greater than or equal to 9.0, using the model constructed by the traditional method. The probabilities estimated using this model clearly do not represent reality.





Figure 4: The probability that the response is greater than or equal to 9.0, using the model constructed by the new MLE strategy.



Figure 5: The reactor configuration used for the simulation. We assume axial symmetry, so only one radial half of the reactor is shown.


Figure 6: The predicted non-uniformity from the DACE model based on the traditional MLE.



Figure 7: The estimate of the probability that the mean flux of Argon ions at the wafer surface is greater than $2.5 \times 10^{20}$ m$^{-2}$ as a function of pressure and coil location, using the DACE model for mean flux constructed by the traditional MLE strategy.


Figure 8: The predicted non-uniformity as a function of pressure and coil location, using the DACE model for non-uniformity constructed using the modified MLE strategy.




Figure 9: The estimate of the probability that the mean flux of Argon ions at the wafer surface is greater than $2.5 \times 10^{20}$ m$^{-2}$ as a function of pressure and coil location, using the DACE model for mean flux constructed using the modified MLE strategy.