\documentclass[11pt,letterpaper]{article}
%\documentclass[Proceedings,11pt,letterpaper]{ascelike-mod}
%\documentclass[11pt]{article}
%\documentclass[12pt,letterpaper.proceedings]{article}
%\documentclass[12pt,letterpaper]{ascelike}
%\documentclass{ascelike}
%time of day as found in layout.tex
% ----------------------------------------------------------------------
%
% TIME OF DAY
%
\newcount\hh
\newcount\mm
\mm=\time
\hh=\time
\divide\hh by 60
\divide\mm by 60
\multiply\mm by 60
\mm=-\mm
\advance\mm by \time
\def\hhmm{\number\hh:\ifnum\mm<10{}0\fi\number\mm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% % draft special command for postscript IZBACI ZA SLUZBENU VERZIJU
% %
% %
% \special{!userdict begin
% /bop-hook{
% gsave
% 40 50 translate %start position
% 90 rotate % orientation
% /Times-Roman findfont %font
% 10 scalefont % scaling of font
% setfont
% 0 0 moveto
% 0.85 setgray % gray level ( 1 -> white ; 0 black )
% (Jeremic, Sett, Kavvas: Draft Paper, part 1) % text you want to see
% show % or: true charpath for hollow letters
% %true charpath
% stroke grestore}def end}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%11pt%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% this is for 11pt
\setlength{\textwidth}{16.6cm}
\setlength{\textheight}{21.5cm}
\setlength{\hoffset}{-1.9cm}
\setlength{\voffset}{-2.2cm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% % draft special command for postscript IZBACI ZA SLUZBENU VERZIJU
% %
% %
% \special{!userdict begin
% /bop-hook{
% gsave
% 50 10 translate %start position
% 90 rotate % orientation
% /Times-Roman findfont %font
% 60 scalefont % scaling of font
% setfont
% 0 0 moveto
% 0.80 setgray % gray level ( 1 -> white ; 0 black )
% (Jeremic: draft paper) % text you want to see
% show % or: true charpath for hollow letters
% %true charpath
% stroke grestore}def end}
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%JB
%%% from UCthesis.sty
%%\setcounter{topnumber}{4}
%%\def\topfraction{.8}
%%\setcounter{bottomnumber}{1}
%%\def\bottomfraction{.8}
%%\setcounter{totalnumber}{4}
%%\def\textfraction{.2}
%%\def\floatpagefraction{.8}
%%\setcounter{dbltopnumber}{2}
%%\def\dbltopfraction{.7}
%%\def\dblfloatpagefraction{.5}
%%%JB
%%
%%
%%
\usepackage{times}
\usepackage{graphics}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{color}
\usepackage{ulem}
\usepackage{fancyheadings}
\usepackage[round,colon,sort,numbers]{natbib}
\renewcommand{\thefootnote}{\arabic{footnote}}
\usepackage{url}
\setlength{\headrulewidth}{0.0mm}
\setlength{\footrulewidth}{0.0mm}
\lhead{}
\rhead{}
\chead{}
\lfoot{}
\cfoot{}
\rfoot{\rm\thepage }
%\addtolength{\headheight}{14pt}
%\addtolength{\footheight}{14pt}
\pagestyle{fancy}
%\usepackage{subeqn}
%
% NOTE: Don't include the \NameTag{} if you have selected
% the NoPageNumbers option: this leads to an inconsistency.
%\NameTag{Kuhn}
%
\begin{document}
%\setlength{\textwidth}{6in}
%\setlength{\oddsidemargin}{.25in}
\thispagestyle{empty}
% You will need to make the title all-caps
\begin{center}
{\Large \bf Parallel Soil--Foundation--Structure Interaction Computations}
\end{center}
%
%\title{Probabilistic Elasto-Plasticity:
%The Role of Uncertainty in Material Parameters on
%Overal Stress--Strain Response}
%
\vspace*{0.5cm}
\begin{flushleft}
%%%%%%%%
Boris Jeremi{\'c}
\\
{\it
Department of Civil and Environmental Engineering, \\
University of California, \\
Davis, CA 95616, \\
\texttt{jeremic@ucdavis.edu}}
\\~\\
%%%%%%
Guanzhou Jie
\\
{\it Wachovia Corporation\\
375 Park Ave,\\
New York, NY}
\end{flushleft}
\vspace*{0.5cm}
\renewcommand{\baselinestretch}{1.2}
\small\normalsize % trick from Kopka and Daly p47.
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \setlength{\textwidth}{16.5cm}
% \setlength{\textheight}{23.0cm}
% \setlength{\hoffset}{-1.0cm}
% \setlength{\voffset}{-2.5cm}
% %
% %\def\baselinestretch{2}
% %
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
% \newpage
% \tableofcontents
% \newpage
%
%
%\keywords{Parallel computing, Elasto--plasticity, Seismic
%soil--foundation--structure interaction.}
\paragraph{ABSTRACT:}
Described here is the Plastic Domain Decomposition (PDD)
method for parallel elastic--plastic finite element computations related to
Soil--Foundation--Structure Interaction (SFSI) problems. The PDD provides for
efficient parallel elastic--plastic finite element computations steered by an
adaptable, run-time repartitioning of the finite element
domain. The adaptable repartitioning aims at balancing the computational
load among processing nodes (CPUs), while minimizing
inter-processor communications and data redistribution during elastic--plastic
computations.
%
The PDD method is applied to a large scale SFSI problem.
The presented examples show the scalability and
performance of PDD computations.
%
A set of illustrative examples is used to show the efficiency of PDD
computations and also to emphasize the importance of the
coupling of the dynamic characteristics of the earthquake, soil and
structure (ESS) on the overall performance of the SFS system.
%
This ESS coupling can only be investigated using detailed
models, which dictates the use of parallel simulations.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{INTRODUCTION}
\label{Intro}
%
Parallel finite element computations have been developed for a number of years,
mostly for elastic solids and structures. The static domain decomposition
(DD) methodology is currently used almost exclusively for decomposing such elastic
finite element domains into subdomains. This subdivision has two main purposes,
namely (a) to distribute element computations to CPUs in an even manner and (b)
to distribute the system of equations evenly to CPUs for maximum efficiency in
the solution process.
However, in the case of inelastic (elastic--plastic) computations, the static
DD is not the most efficient method, since some subdomains become computationally
slow as the elastic--plastic zone propagates through the domain. This
propagation of the elastic--plastic zone (whose extent is not known a priori)
will in turn slow down element level computations (constitutive level
iterations) significantly, depending on the complexity of the material model
used. Propagation of the elastic--plastic zone will eventually result in some
subdomains becoming significantly computationally slower, while others, that
are still mostly elastic, will remain computationally efficient. This
discrepancy in computational efficiency between subdomains results
in inefficient parallel performance. In other words, subdomains (and their
respective CPUs) with mostly elastic elements will finish their local
iterations much faster (and idle afterward) than subdomains (and their
respective CPUs) that have many elastic--plastic elements.
This computational imbalance motivated the development of the Plastic
Domain Decomposition (PDD) method described in this paper.
%
The developed PDD is applied to a large scale seismic
soil--foundation--structure (SFS) interaction problem for bridge systems. It is
important to note that the detailed analysis of seismic SFSI described in this
paper is made possible by the development of PDD, as the modeling requirements
(finite element mesh size) were such that sequential simulations were out of
the question.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Soil--Structure Interaction Motivation}
The main motivation for the development of PDD is the need for detailed analysis of
realistic, large scale SFSI models.
%
This motivation is emphasized by noting that currently, for a vast
majority of numerical simulations of seismic response of bridge
structures, the input excitations are defined either from a family of damped
response spectra or as one or more time histories of ground acceleration.
%
These input excitations are applied simultaneously along the base
of a structural system, usually without taking into account its dimensions and
dynamic characteristics, the
properties of the soil material in the foundations, or the nature of the ground
motions themselves.
%
Ground motions applied in such a way neglect
soil--structure interaction (SSI) effects, which can significantly change
the free field ground motions.
%
A number of papers in recent years have investigated the influence of SSI on
the behavior of bridges \cite{Kappos2002, Makris1993, Makris1994, Sweet1993,
McCallen1994, Chen1977, Dendrou1985}.
%
% %
% % Recently, Tyapin~\cite{Tyapin2007} noted that after four decades of the intensive
% % studies of the SSI effects there still exists a gap between the SSI specialists
% % and practicing civil engineers. The results obtained using advanced SSI
% % simulations tools are often far from the results obtained using general
% % simulation tools that are available to practicing civil engineers, even if these
% % specialized SSI simulations tools match the experimental and field data far
% % better than general simulation tools.
% %
%
McCallen and Romstad \cite{McCallen1994} performed a
remarkable full scale analysis of the soil--foundation--bridge system.
%
The soil material (cohesionless soil, sand) was modeled using an
equivalent elastic approach (using the Ramberg--Osgood material model through
standard modulus reduction and damping curves developed by Seed et al.
\cite{Seed1984}).
%
The two studies by Chen and Penzien \cite{Chen1977} and by Dendrou et al.
\cite{Dendrou1985} analyzed the bridge system including the soil,
however using coarse finite element meshes, which might filter out
significant higher frequencies.
%
Jeremi{\'c} et al. \cite{Jeremic2003a} attempted a detailed, complete
bridge system analysis. However, due to computational limitations, the large
scale pile group had to be modeled separately and its stiffness used with the
bridge structural model. In the present work, with the development of PDD
(described in some detail), such computational limitations are removed and high
fidelity, detailed models of Earthquake--Soil--Structure systems can be
analyzed.
%
It is very important to note that proper modeling of seismic wave propagation
in elastic--plastic soils dictates the size of the finite element mesh. This
requirement for proper seismic wave propagation will in turn result in a large
number of finite elements.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Parallel Computing Background}
The idea of the domain decomposition method can be traced back
to an original paper from 1870 by H.A. Schwarz (Rixena and
Magoul\`{e}s~\cite{Rixena07}).
%
The current state of the art in distributed computing in computational
mechanics can be
traced to early works on parallel simulation technology.
For example, early endeavors using inelastic finite elements focused
on structural problems within tightly coupled, shared memory
parallel architectures. We mention the work by Noor et al.
\cite{Noor1978}, Utku et al. \cite{Utku1982} and Storaasli and
Bergan \cite{Storaasli1987}, in which substructuring was used to
achieve distributed parallelism.
%
Fulton and Su \cite{Fulton1992} developed techniques to account for
different types of elements used in the same computer model,
but used substructures made of the same element type, which
resulted in inefficient use of compute resources.
%
Hajjar and Abel \cite{Hajjar1988} developed techniques for
dynamic analysis of framed structures with the objective of
minimizing communications for a high speed, local grid of
computer resources.
%
Klaas et al. \cite{Klaas1994} developed parallel computational techniques for
elastic--plastic problems, but tied the algorithm to the specific multiprocessor
computers used (and the specific network connectivity architecture),
thus rendering it less general and unusable for other types of
grid arrangements.
%
Farhat \cite{Farhat1987} developed the so--called Greedy
domain partitioning algorithm, which proved to be quite
efficient on a number of parallel computer architectures available.
%
However, most of the above
approaches lose efficiency when used on a heterogeneous
computational grid, which constitutes the currently predominant parallel
computer architecture.
%
More recently, Farhat et al. \cite{Farhat91a,Farhat91b,Farhat92} proposed
the FETI (Finite Element Tearing and Interconnecting) method for
domain decomposition analysis. In the FETI method, Lagrange multipliers
are introduced to enforce
compatibility at the interface nodes.
Although much work has been presented on domain decomposition methods,
the most popular methods, such as those of the FETI type, are based on the
handling of subdomain interface constraints.
%
It is also interesting to note promising efforts on merging iterative solvers
with domain decomposition type preconditioners (Pavarino~\cite{Pavarino07} and
Li and Widlund~\cite{Li07}).
%Schwartz-type
%preconditioners for parallel domain decomposition system
%solving have also shared part of the spotlight (Hwang and Cai~\cite{Hwang07}
%and Sarkis and Szyld~\cite{Sarkis07}).
%
From the implementation point of view, for mesh-based
scientific computations, domain decomposition corresponds to
the problem of mapping a mesh onto a set of processors, which is
well defined as a graph partitioning problem (Schloegel et
al.~\cite{karypis:intro}).
%
A graph based approach to the initial partitioning and the subsequent
repartitioning forms the basis of the PDD method.
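To make this correspondence concrete, the following sketch (illustrative
Python, not part of the simulation platform; the tiny quadrilateral mesh and
the two-shared-nodes adjacency rule are assumptions made for the example)
builds the element adjacency graph that a partitioner such as ParMETIS
operates on:

```python
from itertools import combinations

# Hypothetical 2x2 quadrilateral mesh: element id -> node ids
elements = {
    0: {0, 1, 4, 3},
    1: {1, 2, 5, 4},
    2: {3, 4, 7, 6},
    3: {4, 5, 8, 7},
}

def dual_graph(elements, min_shared=2):
    """Vertices are elements; an edge joins two elements sharing at
    least `min_shared` nodes (a full edge in 2D; a face in 3D would
    require more shared nodes)."""
    adjacency = {e: set() for e in elements}
    for a, b in combinations(elements, 2):
        if len(elements[a] & elements[b]) >= min_shared:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

adj = dual_graph(elements)
# Element 0 neighbors element 1 (shares nodes 1,4) and element 2 (nodes 3,4)
```

Partitioning this adjacency structure into $k$ balanced parts while cutting as
few edges as possible is exactly the graph partitioning problem referred to
above.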
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Simulation Platform}
The parallel simulation program was developed using a number of numerical
libraries.
%
Graph partitioning and repartitioning are achieved using parts of the ParMETIS
libraries (Karypis et al.~\cite{Karypis98c}).
%
Parts of the OpenSees framework (McKenna~\cite{McKenna97}) were used to
connect
the finite element domain. In particular, the Finite Element Model Classes from
OpenSees (namely, the class abstractions Node, Element, Constraint, Load and
Domain)
were used to describe the finite element model and to store the results of the
analysis
performed on the model. In addition, the existing Analysis
Classes were used as a basis for the development of the parallel PDD framework,
which is then used
to drive the global level finite element analysis, i.e., to form and
solve the global system of equations in parallel.
%
The Actor model \citep{Hewitt73,Agha84} was used; together with the Shadow,
Channel, MovableObject, ObjectBroker and MachineBroker classes within the
OpenSees framework
\citep{McKenna97}, it provided an excellent basis for our development.
%
%
On a lower level, the Template3Dep numerical libraries
(Jeremi{\'c} and Yang~\cite{Jeremic2000f})
were
used for constitutive level integrations, the nDarray numerical libraries
(Jeremi{\'c} and Sture~\cite{Jeremic97d}) were used to handle vector, matrix
and tensor manipulations, while the FEMtools element libraries from the UCD
CompGeoMech toolset (Jeremi{\'c}~\cite{Jeremic2004d}) were used to supply other
necessary components.
%
%
%
The parallel solution of the system of equations is provided by the PETSc set
of numerical libraries (Balay et al.~\cite{petsc-web-page, petsc-user-ref,
petsc-efficient}).
Most of the simulations were carried out on our local parallel computer GeoWulf.
Only the largest models (too big to fit on the GeoWulf system) were simulated
on TeraGrid machines (at SDSC and TACC). It should be noted that the program
sources described here are available through the author's web site.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%
\section{PLASTIC DOMAIN DECOMPOSITION METHOD}
\label{platform}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Introduction}
The Domain Decomposition (DD) approach is one of the most popular methods
used to implement and perform parallel finite element simulations. The
underlying
idea is to divide the problem domain into subdomains
so that finite element calculations can be performed on each individual
subdomain in parallel. The DD can be overlapping or
non-overlapping. The overlapping domain decomposition method divides the
problem domain into several slightly overlapping subdomains.
Non-overlapping domain decomposition is extensively used in continuum finite
element modeling, due to the relative ease of programming and organizing
computations, and is the one examined in this work.
%
In general, a good non-overlapping decomposition algorithm
should be able to (a) handle irregular meshes of arbitrarily shaped domains,
and
(b) minimize the interface problem size by delivering minimal
boundary connectivities, which helps reduce the communication
overhead.
%
Elastic--plastic computations introduce a number of additional requirements for
parallel computing. Those requirements are described below.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{The Elastic--Plastic Parallel Finite Element Computational Problem}
The distinct feature of elastic--plastic finite
element computations is the presence of two iteration levels.
In a standard displacement based finite element implementation,
the constitutive driver at each integration (Gauss) point iterates in stress
and
internal variable space, computes the updated stress state and
constitutive stiffness tensor, and delivers them to the finite
element functions. The finite element functions then use the
updated stresses and stiffness tensors to integrate new
(internal) nodal forces and the element stiffness matrix. Then, on the
global level, the nonlinear equations are iterated on until
equilibrium between internal and external forces is satisfied
within some tolerance.
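The two iteration levels can be sketched as follows (a deliberately minimal,
one-dimensional Python illustration, not the actual implementation; the
bilinear material and the single-element bar are assumptions made for
brevity):

```python
def constitutive_update(strain, E=200.0, H=20.0, sy=0.25):
    """Inner level: return (stress, tangent) for a 1D bilinear
    elastic-plastic law; real constitutive drivers iterate here,
    this toy law has a closed form."""
    trial = E * strain
    if abs(trial) <= sy:
        return trial, E                          # elastic branch
    sign = 1.0 if trial > 0.0 else -1.0
    dlam = (abs(trial) - sy) / (E + H)           # plastic multiplier
    return sign * (sy + H * dlam), E * H / (E + H)

def solve_bar(f_ext, length=1.0, area=1.0, tol=1e-10):
    """Outer level: global Newton iteration on equilibrium between
    internal and external force for a one-element bar."""
    u = 0.0
    for _ in range(50):
        stress, tangent = constitutive_update(u / length)
        residual = f_ext - stress * area         # out-of-balance force
        if abs(residual) < tol:
            return u
        u += residual / (tangent * area / length)
    raise RuntimeError("global iteration did not converge")

u = solve_bar(0.30)   # load beyond yield force 0.25 -> plastic response
```

Each outer (global) iteration triggers a full sweep of inner (constitutive)
updates; in a real model the inner updates are themselves iterative, which is
what makes their cost unpredictable.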
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Elastic Computations.}
In the case of elastic computations, the constitutive
driver has the simple task of computing the increment in stresses
($\Delta \sigma_{ij}$) for a given deformation increment
($\Delta \epsilon_{kl}$), through a closed form equation
($\Delta \sigma_{ij} = E_{ijkl} \Delta \epsilon_{kl}$). It is
important to note that in this case the amount of work per
Gauss point is known in advance and is the same for every integration point.
If we assume the
same number of integration points per element, it follows that
the amount of computational work is the {same} for each
element and is {known in advance}.
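The fixed cost of the elastic update can be seen by writing out the closed
form expression for isotropic elasticity (an illustrative Python sketch; the
Lam\'{e} constants are arbitrary example values):

```python
def elastic_increment(deps, lam=100.0, mu=80.0):
    """Closed form isotropic update:
    dsigma_ij = lam * delta_ij * deps_kk + 2 * mu * deps_ij.
    A fixed number of multiply-adds, identical at every Gauss point."""
    trace = sum(deps[i][i] for i in range(3))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * deps[i][j]
             for j in range(3)]
            for i in range(3)]

# Uniaxial strain increment: only deps_11 is nonzero
deps = [[1e-3, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
dsig = elastic_increment(deps)
# axial component (lam + 2*mu)*1e-3, lateral components lam*1e-3
```

No branching or iteration appears: the work per point is a known constant,
which is what makes a static, even split of elements an adequate load balance
for purely elastic runs.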
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Elastic-Plastic Computations.}
For elastic--plastic problems, for a given incremental deformation
the constitutive driver iterates in stress and internal
variable space until the consistency condition is satisfied. The number of
implicit constitutive
iterations is
not known in advance. Similarly, if explicit constitutive computations are
used, the amount of work at each Gauss point is much higher than it is for an
elastic step.
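The unpredictable iteration count can be illustrated with a one-dimensional
implicit return mapping (a Python sketch with an assumed Voce-type
exponential hardening law, not a model from this work):

```python
import math

def return_map(trial, E=200.0, sy=0.25, s_inf=0.5, c=50.0, tol=1e-12):
    """1D implicit return mapping with Voce (exponential) hardening.
    Newton iteration on the consistency condition f = 0; the number
    of iterations needed is not known in advance."""
    if abs(trial) <= sy:
        return trial, 0                      # elastic: no iterations
    dl, it = 0.0, 0                          # plastic multiplier
    while True:
        it += 1
        hard = sy + (s_inf - sy) * (1.0 - math.exp(-c * dl))
        f = abs(trial) - E * dl - hard       # consistency condition
        if abs(f) < tol or it > 100:
            break
        dfddl = -E - c * (s_inf - sy) * math.exp(-c * dl)
        dl -= f / dfddl                      # Newton update
    return math.copysign(abs(trial) - E * dl, trial), it

stress_a, iters_a = return_map(0.26)   # barely past yield
stress_b, iters_b = return_map(1.00)   # far past yield
```

An elastic trial state returns immediately, while a plastic one iterates a
data-dependent number of times: the per-point cost depends on the response,
not the mesh.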
%
Initially, all Gauss points are in an elastic state, but as the
incremental loads are applied, elastic--plastic zones develop. For
integration
points still in the elastic range, the computational load is light. However,
for
Gauss points that are elastic--plastic, the computational load increases
significantly (more so for implicit computations than for explicit ones).
%
This computational load increase depends on the complexity of the material
model.
For example, constitutive level integration
algorithms for soils, concrete, rocks, foams and other granular
materials can be very computationally demanding. More than 70\% of the
wall clock time during an elastic--plastic finite element
analysis might be spent in constitutive level iterations. This is in
sharp contrast with elastic computations, where the dominant
part is solving the system of equations, which consumes about
90\% of the run time. The extent of the additional, constitutive level
computations is not known before the actual computations are
over. In other words, the extent of the elastic--plastic domain is
not known ahead of time.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The traditional pre-processing type of Domain Decomposition
method (also known as topological DD) splits the domain based on
the initial geometry and mesh connectivity and assigns roughly the same number
of
elements to every computational node while minimizing the size of
subdomain boundaries. This approach might result in serious
computational load imbalance for elastic--plastic problems. For
example, one subdomain might be assigned all of the elastic--plastic
elements and spend a large amount of time in constitutive level
computations. Another subdomain might have all elements in the elastic
state and thus spend far less computational time computing
stress increments. This results in the program having to wait for
the slowest subdomain (the one with a large number of elastic--plastic
finite elements) to complete constitutive level iterations, and
only proceed with global system iterations after that.
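The resulting imbalance can be quantified with toy numbers (illustrative
values, not measured data): assuming fixed per-element elastic and plastic
costs, the wall clock time of one global step equals that of the slowest
subdomain:

```python
def step_time(subdomains, t_elastic=1.0, t_plastic=10.0):
    """Wall time of one global step = time of the slowest subdomain.
    Each subdomain is (number of elastic, number of plastic elements).
    Also returns the fraction of total CPU time spent idle."""
    times = [n_el * t_elastic + n_pl * t_plastic
             for n_el, n_pl in subdomains]
    wall = max(times)
    idle = sum(wall - t for t in times) / (wall * len(times))
    return wall, idle

# Four subdomains of 100 elements each, all plastic elements on one CPU:
wall, idle = step_time([(100, 0), (100, 0), (100, 0), (0, 100)])
# the subdomain holding all plastic elements is 10x slower, so the
# other three CPUs spend most of the step idle
```

Even with a perfectly even element count, concentrating the plastic elements
on one CPU leaves the majority of the machine waiting at the synchronization
point.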
The two main challenges of computational load balancing for elastic--plastic
computations are that such computations are:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item Adaptive: the computations require dynamic load balancing, as the extent
of the elastic and elastic--plastic domains changes dynamically and
unpredictably during
the course of the computation.
\item Multiphase: the elastic--plastic computations follow the
elastic computations, with a synchronization phase between
the two. The existence of the
synchronization step between the two phases of the computation
requires that each phase be individually load balanced.
\end{itemize}
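These two requirements lead naturally to a multi-constraint view, in which
each element carries a weight vector with one entry per phase. The sketch
below (illustrative Python with made-up weights, not the ParMETIS interface)
shows that a partition can balance the summed cost while badly imbalancing
each individual phase:

```python
def phase_imbalance(partition, weights):
    """Per-phase imbalance: max subdomain load divided by the average
    (1.0 means perfectly balanced). `partition[e]` is the subdomain of
    element e; `weights[e]` is (elastic cost, plastic cost)."""
    n_parts = max(partition) + 1
    totals = [[0.0, 0.0] for _ in range(n_parts)]
    for elem, part in enumerate(partition):
        for phase in (0, 1):
            totals[part][phase] += weights[elem][phase]
    imb = []
    for phase in (0, 1):
        avg = sum(t[phase] for t in totals) / n_parts
        imb.append(max(t[phase] for t in totals) / avg)
    return imb

# Four elements; every element costs 10 in total, so any two-two split
# balances the *sum*, but only one split balances *each phase*.
weights = [(9, 1), (9, 1), (1, 9), (1, 9)]
bad  = phase_imbalance([0, 0, 1, 1], weights)   # each phase 80% over
good = phase_imbalance([0, 1, 0, 1], weights)   # both phases balanced
```

A repartitioner that balances each weight component separately avoids the
situation where a CPU is overloaded in one phase and idle in the other.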
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \subsubsection{Adaptive Computation}
% First, these computations are dynamic in nature. That is, the
% structure of elastic and elastic-plastic domains changes
% dynamically and
% unpredictably during the course of the computation. For this
% reason, a static decomposition computed as a pre-processing step
% is not sufficient to ensure the computational load-balance of
% the entire computation. Instead, periodic computational
% load-balancing is required during the course of the computation.
% The problem of computing a dynamic decomposition shares the same
% requirements as that of computing the initial decomposition
% (i.e., balance the mesh elements and minimize the
% inter-processor communications), while also requiring that the
% cost associated with redistributing the data in order to balance
% the computational load is minimized. This last requirement
% prevents us from simply computing a whole new static
% partitioning from scratch each time computational load-balancing
% is required.
%
% Often, the objective of minimizing the data redistribution cost
% is at odds with the objective of minimizing the inter-processor
% communications. For applications in which the computational
% requirements of different regions of the domain change rapidly,
% or the amount of state associated with each element is
% relatively high, minimizing the data redistribution cost is
% preferred over minimizing the communications incurred during
% parallel processing.
%
% For applications in which computational load-balancing occurs
% very infrequently, the key objective of a load-balancing
% algorithm is in obtaining the minimal inter-processor
% communications. For many application domains, it is
% straightforward to select a primary objective to minimize (i.e.,
% minimize whichever cost dominates). However, one of the key
% issues concerning the elastic-plastic computation is that the
% number of iterations between computational load-balancing phases
% is both unpredictable and dynamic. For example, in the case of
% static problems, zones in the 3D solid may become plastic and
% then unload to elastic (during increments of loading) so that
% the extent of plastic zone is changing. The change can be both
% slow and rapid. Slow change usually occurs during initial
% loading phases, while the later deformation tends to localize in
% narrow zones rapidly and the rest of the solid unloads rapidly
% (becomes elastic again). The narrow, localized zone has heavy
% computational load on the constitutive level (in each
% integration point within elements). Similar phenomena is
% observed in seismic soil-structure interaction computations
% where stiff structure interacts with soft soil and elastic and
% elasto-plastic zones change significantly during loading
% cycles. In this type of computation, it is extremely difficult
% to select the type of computational load-balancing algorithm to
% employ. Furthermore, the preferred computational load-balancing
% algorithm is liable to change during the course of the
% computation, and so the selection must be made dynamically.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \subsubsection{Multiphase Computation}
% The second challenge
% associated with computational load-balancing elastic-plastic
% computations in geomechanics is that these are two-phase
% computations. That is, elastic-plastic computations follow up the
% elastic computations. There is a synchronization phase between
% the computations, as only after the elastic computation is
% finished is it possible to check if the elastic-plastic computation is
% required for a given integration (Gauss) point within an
% element. For regions of the mesh in which this check indicates
% that the elastic-plastic computation is necessary, lengthy
% elastic-plastic
% computations are then performed. The existence of the
% synchronization step between the two phases of the computation
% requires that each phase be individually load balanced. That is,
% it is not sufficient to simply sum up the relative times
% required for each phase and to compute a decomposition based on
% this sum. Doing so may lead to some processors having too much
% work during the elastic computation (and so, these may still be
% working after other processors are idle), and not enough work
% during the elastic-plastic computation, (and so these may be idle while
% other processors are still working), and vice versa. Instead, it
% is critical that every processor have an equal amount of work
% from both of the phases of the computation.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \subsubsection{Multiconstraint Graph Partitioning}
% Elastic-plastic FE computation can be understood as a two-phase
% calculation, which is also dynamic in nature. Traditional graph
% partitioning formulations are not adequate to ensure its
% efficient execution on high performance parallel computers. In
% this work very recent progresses from the graph
% partitioning algorithm research will be investigated.
% We need new adaptive
% graph partitioning formulations, which can compute adaptive
% partitioning-repartitioning that can satisfy an arbitrary
% number of balance constraints.
% %
% %
% \begin{itemize}
%
% \item {Static Graph Partitioning}\\
% Given a weighted, undirected graph $G=(V,E)$, for which each
% vertex and edge has an associated weight, the $k$-way graph
% partitioning problem is to split the vertices of $V$ into $k$
% disjoint subsets (or {\em subdomains}) such that each subdomain
% has roughly an equal amount of vertex weight (referred to as
% the {\em balance constraint}), while minimizing the sum of the
% weights of the edges whose incident vertices belong to
% different subdomains (i.e., the edge-cut).
%
% \begin{enumerate}
%
% \item {Geometric Techniques}\\
% Compute partitioning based solely on the coordinate
% information of the mesh nodes, without considering edge-cut.
% Popular methods include, {\em Coordinate Nested Dissection}
% (CND or Recursive Coordinate Bisection), {\em Recursive
% Inertial Bisection} (RIB), {\em Space-Filling Curve} techniques
% and {\em Sphere-Cutting} approach.
%
% \item {Combinatorial Techniques}\\
% Attempt to group together highly connected vertices whether or
% not these are near each other in space. That is combinatorial
% partitioning schemes compute a partitioning based only on the
% adjacency information of the graph; they do not consider the
% coordinates of the vertices. They tend to have lower edge-cuts
% but generally slower. Popular methods include, {\em Levelized
% Nested Dissection} (LND) and {\em
% Kernighan-Lin/Fiduccia-Mattheyses} (KL/FM) partitioning
% refinement algorithm, which needs an initial partition input to
% do swapping refinement.
%
% \item {Multilevel Schemes}\\
% The multilevel paradigm consists of three phases: graph
% coarsening, initial partitioning, and multilevel refinement.
% Firstly, we form coarse graph by collapsing together selected
% vertices of the input graph. After rounds of coarsening, we get
% coarsest graph, on which an initial bisection will be
% performed. Then the KL/FM algorithm can be used to refine the
% partition back to the finest graph.
%
% The multilevel paradigm works well for two reasons. First, a
% good coarsening scheme can hide a large number of edges on the
% coarsest graph, which makes the task of computing high-quality
% partitioning easier. Second reason, incremental refinement
% schemes such as KL/FM become much more powerful in the
% multilevel context.
%
% Popular algorithms include {\em Multilevel Recursive Bisection}
% and {\em Multilevel $k$-Way Partitioning}.
%
% \end{enumerate}
%
% \item {Adaptive Graph Partitioning}\\
% For large scale elasto-plastic FE simulations, it is necessary
% to dynamically load-balance the computations as the analysis
% progresses due to unpredictable plastification inside the
% domain. This dynamic load balancing can be achieved by using a
% graph partitioning algorithm.
%
% Adaptive graph partitioning shares most of the requirements and
% characteristics of static graph partitioning but also adds an
% additional objective. That is, the amount of data that needs to
% be redistributed among the processors in order to balance the
% load should be minimized. If the vertex weight represents the
% computational cost of the work carried by the vertex, another
% metric, {\em size} of the vertex needs to be considered as
% well, which reflects distribution cost of the vertex. Thus, the
% repartitioner should attempt to balance the partitioning with
% respect to vertex weight while minimizing vertex migration with
% respect to vertex size.
%
% Different approaches are available. One can simply compute a
% new graph from scratch, so called {\em Scratch-Remap
% Repartitioner}, which, as expected introduces more data
% redistribution than necessary. {\em Diffusion-Based
% Repartitioner} attempt to minimize the difference between the
% original partitioning and the final repartitioning by making
% incremental changes in the partitioning to restore balance.
% This method has been a very active research topic in recent years,
% (Dongarra et al.~\cite{dongarra:sourcebook} gives up-to-date review).
%
% \item {Multiconstraint Graph Partitioning}\\
% We can see traditional graph partitioning typically balances
% only a single constraint (i.e., the vertex weight) and
% minimizes only a single objective (i.e., the edge-cut). If we
% replace the vertex weight, which is a single number, with a
% weight vector of size $m$, then the problem becomes that of
% finding a partitioning that minimizes the edge-cuts subject to
% the constraints that each of the $m$ weights is balanced across
% subdomains.
%
% Multilevel
% graph partitioning algorithms for solving
% multiconstraint/multiobjective problems have been very successful (Schloegel et
% al.~\cite{karypis:intro}).
% The software libraries METIS and ParMETIS are widely used in
% computational mechanics research.
% \end{itemize}
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{PDD Algorithm}
%
% One significant drawback of current implementation is that
% neither network communication nor model regeneration cost has
% been considered in element-graph-based type domain
% decomposition algorithm. Element graph only records
% computational load carried by each element. Only one $ITR$
% factor characterizes algorithmic approach of the load balancing
% operation and this is apparently too crude for complicated
% network/hardware configurations. The ignorance of the
% repartitioning-associated overheads inherent with application
% codes can lead to serious performance drop of the proposed PDD
% algorithm as shown in Figure~\ref{fig:9k_badresult}.
% \begin{figure}[!htbp]
% \begin{center}
% \includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/9K_Shallow_Abs_Speedup_Data_5}
% \caption{\label{fig:9k_badresult} Absolute Speedup Data of Parallel Runs on 9,297 Elements, 32,091 DOFs
% Model, ITR=1e-3, Imbal Tol 5\%}
% \end{center}
% \end{figure}
%
% This drawback can harm the overall performance of the whole
% application code more seriously when the simulation is to be
% run on heterogeneous networks, which means we can have
% different network connections and nodes with varied
% computational power. The dilemma is, without exact
% monitoring of network communication and local model
% regeneration costs, we can easily sacrifice the performance
% gain by load balancing operations.
%
% A second approach proposed in this work was the idea of
% modified {\em Globally Adaptive PDD} algorithm. The
% novelty comes from the fact that both data redistribution and
% analysis model regeneration costs will be monitored during
% execution. Load balancing will be triggered only when the
% performance gain necessarily offset the extra cost associated
% with the whole program. Domain graph structures will be kept intact
% till successful repartitioning happens. Meanwhile all elemental
% calculations will be timed to provide graph vertex weights.
% Data will be accumulated till algorithm restart happens, when
% all analysis model and vertex weights will be nullified.
%
% This improvement aims at handling network communication
% and any specific application-associated overheads automatically
% at the global level in order to remedy the
% drawback that the element graph repartitioning kernel currently
% supported by ParMETIS is not capable of directly reflecting
% this application level overheads. The new strategy is to
% automatically monitor network communication and local model
% regeneration timings which will be integrated to the entry of
% load balancing routines to act as additional triggers of the
% operation along with the load imbalance tolerance.
%
% Performance study shows that PDD algorithm with the new additions
% significantly improve performance even when the number of processing
% units is large. This modification fixes the drawback shown in
% previous sections that the performance of PDD was beaten by
% static domain decomposition when the number of processors
% increases.
%
%
%
% This strategy is called to be {\em globally adaptive} because
% both data communication and model regeneration costs are
% monitored at the application level, which tells best how the
% real application performs on all kinds of networks. Whatever
% the network/hardware configurations might be, real application
% runs always deliver the most accurate performance counters.
% This information can be applied on top of graph partitioning
% algorithm as a supplement to account for the drawback that the
% algorithm kernel is not capable of integrating global data
% communication costs.
%
The Plastic Domain Decomposition (PDD) algorithm
provides
computationally load balanced subdomains, minimizes subdomain boundaries and
minimizes the cost of data redistribution during dynamic load balancing.
%
The PDD optimization algorithm is based on dynamically monitoring both data
redistribution and analysis model regeneration costs during program execution in
addition to collecting information about the cost of constitutive level
iterations within each finite element.
%
A static domain decomposition is used to create the initial partitioning, which is
used for the first load increment.
%
%Since, after that first load step, parts of the computational domain will (might)
%plastify, a computational load imbalance might occur.
%
Computational load (re--)balancing might be triggered if, during
parallel elastic--plastic computations, computations on one of the compute
nodes (one of the subdomains) become slower
than on the other compute nodes (other subdomains).
%
The algorithm for computational load balancing (CLB) is triggered if the performance
gain resulting from CLB offsets the extra cost associated with the
repartitioning.
%
The decision on whether to trigger repartitioning is based on an algorithm
described in some detail below.
%Domain graph structures will be kept intact
%till successful repartitioning happens. Meanwhile all elemental
%calculations will be timed to provide graph vertex weights.
%Data will be accumulated till algorithm restart happens, when
%all analysis model and vertex weights will be nullified.
%This improvement aims at handling network communication
%and any specific application-associated overheads automatically
%at the global level in order to remedy the
%drawback that the element graph repartitioning kernel currently
%supported by ParMETIS is not capable of directly reflecting
%this application level overheads. The new strategy is to
%automatically monitor network communication and local model
%regeneration timings which will be integrated to the entry of
%load balancing routines to act as additional triggers of the
%operation along with the load imbalance tolerance.
%
%Performance study shows that PDD algorithm with the new additions
%significantly improve performance even when the number of processing
%units is large. This modification fixes the drawback shown in
%previous sections that the performance of PDD was beaten by
%static domain decomposition when the number of processors
%increases.
%
%
%This strategy is called to be {\em globally adaptive} because
%both data communication and model regeneration costs are
%monitored at the application level, which tells best how the
%real application performs on all kinds of networks. Whatever
%the network/hardware configurations might be, real application
%runs always deliver the most accurate performance counters.
%This information can be applied on top of graph partitioning
%algorithm as a supplement to account for the drawback that the
%algorithm kernel is not capable of integrating global data
%communication costs.
%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsubsection{Implementations}
We define the global overhead associated with the load
balancing operation to consist of two parts, the data communication cost
$T_{comm}$ and the finite element model regeneration cost $T_{regen}$,
%
\begin{equation}
T_{overhead} := T_{comm} + T_{regen}
\end{equation}
%
Performance counters are set up within the program to follow both.
%
Data communication patterns
characterizing the network configuration can be readily measured ($T_{comm}$)
as the program performs the initial partitioning. The initial (static) domain
decomposition is performed for the first load step.
The cost of networking inherently changes as
network conditions vary while the simulation progresses, so
whenever data redistribution happens, the metric is automatically updated
to reflect the most current network conditions.
%
Model regeneration cost ($T_{regen}$) results from the need to
regenerate the analysis model whenever elements (and nodes) are moved between computational
nodes (CPUs). It is important to
note that model (re--)generation also happens initially when the first
data distribution is done (from the static DD). This initial DD
phase provides an excellent initial estimate of the
model regeneration cost on any specific hardware configuration.
%
The ability to measure realistic compute costs allows the developed algorithm (PDD)
to be used on multiple generations of parallel computers.
%
For the load balancing operations to be effective,
the $T_{overhead}$ has to be offset by the performance gain
$T_{gain}$.
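As a rough illustration of this bookkeeping, both overhead terms can be tracked with
wall clock timers that are refreshed each time data redistribution and model
regeneration actually occur. The following is a minimal Python sketch; the class and
method names are illustrative and not taken from the actual implementation:

```python
import time

class OverheadCounters:
    """Tracks the two load-balancing overhead terms: T_comm (data
    redistribution cost) and T_regen (model regeneration cost).
    Each timer is refreshed whenever the corresponding operation
    actually runs, so the counters reflect current conditions."""

    def __init__(self):
        self.t_comm = 0.0   # latest measured data communication cost
        self.t_regen = 0.0  # latest measured model regeneration cost

    def time_communication(self, redistribute, *args):
        # Wall-clock the data redistribution call.
        start = time.perf_counter()
        redistribute(*args)
        self.t_comm = time.perf_counter() - start

    def time_regeneration(self, regenerate, *args):
        # Wall-clock the local analysis model regeneration call.
        start = time.perf_counter()
        regenerate(*args)
        self.t_regen = time.perf_counter() - start

    @property
    def t_overhead(self):
        # T_overhead := T_comm + T_regen
        return self.t_comm + self.t_regen
```

In this sketch the most recent measurements overwrite the previous ones, mirroring
the idea that each repartitioning updates the metric to the current network state.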
%
%
The finite element mesh for the given model is represented by a graph, where
each finite element is represented by a graph vertex.
%
The computational load imposed by
each finite element (FE) is represented by the associated vertex weight
$vwgt[i]$.
If the summation operation is applied on every
processing node, the exact
computational load distribution among processors can be obtained as the
total wall clock time for each CPU
%
\begin{equation}
\label{eq:Tsum}
T_j := \sum_{i=1}^{n_j}vwgt[i],\, j=1,2,\ldots,np
\end{equation}
%
where $n_j$ is the number of elements on processing
domain $j$ and $np$ is the number of CPUs.
%
The wall clock time is controlled by $T_{max}$, defined as
%
\begin{equation}
T_{sum} := \sum_{j=1}^{np}T_j,\quad T_{max} := \max_{j}(T_j),\quad \mbox{and}\quad
T_{min} := \min_{j}(T_j),\quad j=1,2,\ldots,np
\end{equation}
%
Minimizing $T_{max}$ thus becomes the main objective. Computational load balancing
comprises delivering evenly
distributed computational loads among processors. Theoretically, the
best execution time is,
%
\begin{equation}
T_{best} := T_{sum}/np, \, \mbox{and}\; T_j \equiv T_{best}, \,
j=1,2,\ldots,np
\end{equation}
%
if perfect load balance is achieved.
Based on the definitions above, the best performance gain
$T_{gain}$ that can be obtained from a computational load balancing operation is,
%
\begin{equation}
T_{gain} := T_{max} - T_{best}
\end{equation}
%
Finally, the load balancing operation will be beneficial if and only if
%
\begin{equation}
T_{gain} \ge T_{overhead} = T_{comm} + T_{regen}
\end{equation}
%
The previous equation is used to decide whether re--partitioning is triggered in
the current incremental step. It is important to note that PDD will always
perform at least as well as static DD, as static DD represents the first decomposition of the
computational domain. If such a decomposition becomes computationally unbalanced
and efficiency can be gained by repartitioning, PDD will be triggered and
$T_{max}$ will be minimized.
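The trigger decision can be sketched as a short routine (a minimal Python sketch,
assuming the per--CPU totals $T_j$ from Equation \ref{eq:Tsum} have already been
gathered; function and variable names are illustrative, not taken from the actual
implementation):

```python
def should_repartition(elapsed_per_cpu, t_comm, t_regen):
    """Decide whether to trigger repartitioning for the current
    incremental step.

    elapsed_per_cpu : list of T_j, summed element timings (vertex
                      weights) on each of the np CPUs
    t_comm, t_regen : measured overhead terms

    Returns True if and only if the best possible gain,
    T_gain = T_max - T_best, offsets T_overhead = T_comm + T_regen.
    """
    np_cpus = len(elapsed_per_cpu)
    t_sum = sum(elapsed_per_cpu)
    t_max = max(elapsed_per_cpu)
    t_best = t_sum / np_cpus   # perfect balance: every T_j == T_best
    t_gain = t_max - t_best
    return t_gain >= t_comm + t_regen
```

For example, per--CPU timings of $(10, 2, 2, 2)$ seconds give $T_{best} = 4$~s and
$T_{gain} = 6$~s, so repartitioning pays off whenever the measured overhead is
below 6 seconds; a perfectly balanced run gives $T_{gain} = 0$ and is never
repartitioned.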
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Solution of the System of Equations}
A number of different algorithms and implementations
exist for solving unsymmetric\footnote{Non--associated elasto--plasticity
results in non--symmetric stiffness tensors, which result in a non--symmetric
system of finite element equations. Non--symmetry can also result from the use of
consistent stiffness operators as described by Jeremi{\'c} and
Sture~\cite{Jeremic96b}.} systems of equations in parallel.
In the presented work, use was made of both iterative and direct solvers
available through the PETSc
interface (Balay et al.~\cite{petsc-web-page, petsc-user-ref, petsc-efficient}).
Direct solvers, including MUMPS,
SPOOLES, SuperLU and PLAPACK, have been tested and used for performance
assessment. In addition, iterative solvers, including GMRES, as well as
preconditioning techniques (Jacobi, incomplete LU decomposition and
approximate inverse preconditioners) for Krylov methods have also been used and
their performance assessed.
Some of the conclusions drawn are that, for our specific finite element models
(non--symmetric, using the penalty method for some connections, possible softening
behavior), direct
solvers outperform iterative solvers significantly.
As expected, direct solvers were not as scalable as iterative solvers;
however, specifics of our finite element models (dealing with
soil--structure interaction) resulted in poor initial
performance of iterative solvers that, even with excellent performance scaling, could not
catch up with the efficiency of direct solvers.
%
It is also important to note that parallel direct solvers, such as
MUMPS and SPOOLES, provided the best performance and would be recommended for use
with finite element models that, as ours did, feature non--symmetry, are poorly
conditioned (ill--posed due to the use of the penalty method) and can be
negative definite (for softening materials).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{PDD Scalability Study}
The developed PDD method was tested on a number of static and dynamic
examples. Presented here are scalability (speed--up) results for a series of
soil--foundation--structure model runs. A hierarchy of models, described
later in section \ref{model}, was used in the scaling study for dynamic runs.
%
Finite element models were subjected to recorded earthquakes (two of
them, described in section \ref{results}), using
DRM for seismic load application (Bielak et al.~\cite{Bielak2001} and
Yoshimura et al.~\cite{Yoshimura2001}).
Total wall clock time has been recorded and used to analyze
the parallel scalability of PDD, presented in
Figure~\ref{fig:3bent_speedup}.
%
\begin{figure}[!htbp]
\begin{center}
%\includegraphics[width=0.6\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/TransientSpeedup.pdf}
\includegraphics[width=0.6\textwidth]{TransientSpeedup.pdf}
\caption{\label{fig:3bent_speedup} Scalability Study on 3 Bent
SFSI Models, DRM Earthquake Loading, Transient Analysis, ITR=1e-3, Imbal
Tol 5\%.}
\end{center}
\end{figure}
There are a number of interesting observations about the performance scaling
results shown
in Figure \ref{fig:3bent_speedup}:
%
\begin{itemize}
\item the scalability is quite linear for a small number of DOFs (elements),
\item there is a relation between the number of DOFs (elements) and the number of
CPUs which governs the parallel efficiency. In other words, there exists a certain ratio
of the number of DOFs to the number of CPUs beyond which the communication overhead
starts to be significant. For example, for
the model with 484,104 DOFs in Figure \ref{fig:3bent_speedup}, the
computations with 256 CPUs are more efficient than those with 512 CPUs. This
means that for the same number of DOFs (elements) doubling the number of CPUs
does not help; rather, it is detrimental, as there is far more communication
between CPUs, which destroys the efficiency of the parallel computation.
A similar trend is observable for the large model with 1,655,559 DOFs, where
1024 CPUs still (barely) help increase the parallel efficiency.
\item Another interesting observation has to do with the relative
computational balance of local, element level computations (local equilibrium
iterations) and the system of equations solver.
PDD scales very nicely, as its main efficiency objective is to have equal
computational load for element level computations. However, the
efficiency of the system of equations solver becomes more prominent when
element level computations are less prominent (once they have been significantly
optimized with a large efficiency gain).
For example, for the model with 56,481 DOFs it is observed that in the sequential
case (1 CPU), element level
computations account for approximately 70\% of wall clock time. For the same model, run in
parallel on 8 CPUs,
element level computations account for approximately 40\% of wall clock time, while
on 32 CPUs they account for only 10\% of total wall clock time.
In other words, as the number of CPUs increases, the element level computations
become very efficient and the main efficiency gain can then be made with
the system of equations solver. However, it is important to note that parallel
direct solvers are not scalable up to a large
number of CPUs (Demmel et al.~\cite{superlu99}), while parallel iterative solvers
are much more scalable
but find it difficult to guarantee convergence. This observation can be used in fine
tuning of parallel computing efficiency, even if it clearly points to a number
of possible problems.
\end{itemize}
% % \item There are some parameters that can be calibrated in the
% % current implementation. As indicated by results of thorough
% % numerical tests, ITR=0.001 and load imbalance tolerance
% % ubvec=1.05 (5\%) should be adopted and studies on our
% % application in this work have shown they are adequate
% % and able to bring performance not worse than the commonly used
% % domain decomposition method in parallel finite element analysis.
% % \item For the parameters suggested in the work, we
% % can see a general trend that the efficiency of PDD will drop as
% % the number of processors increases.
% % This can be explained. The
% % implication of increasing processing units is that the
% % subdomain problem size will decrease. It is naturally evident that
% % the repartition load balancing won't be able to recover the
% % overhead by balancing off small size local calculations. The
% % improved design of globally adaptive PDD algorithm has been
% % implemented in this work and both data communication
% % and model regeneration costs associated with graph
% % repartitioning have been integrated into the new globally
% % adaptive strategy. With the new design, it has also been shown that
% % the PDD algorithm consistently outperforms classic one step
% % domain decomposition algorithm and better scalability can be
%
% obtained as shown in Figure~\ref{fig:summary}. It has been
% shown that even for
% large number of processors, the current implementation can
% always guarantee
% that the performance of PDD is not worse than static DD method
% as shown in Figure~\ref{fig:3dbar}.
% (the repartition routine has less than 5\% overhead of the
% total wall clock time).
% %%\begin{landscape}
% %\begin{figure}[!htbp]
% %\begin{center}
% %\includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/PDDSpeedup.pdf}
% %\caption{\label{fig:summary}
% % Scalability of PDD, Static
% %Loading, Shallow Foundation Model, ITR=1e-3, Imbal Tol 5\%}
% %\end{center}
% %\end{figure}
% %%\end{landscape}
% %%\begin{landscape}
% %\begin{figure}[!htbp]
% %\begin{center}
% %\includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/PDDSpeedUp3DBar.pdf}
% %\caption{\label{fig:3dbar} Relative Speedup of PDD over DD,
% %Static Loading, Shallow Foundation Model, ITR=1e-3, Imbal Tol
% %5\%}
% %\end{center}
% %\end{figure}
% %%\end{landscape}
% %\clearpage
% %\item
% If the problem size is fixed, there exists an optimum
% number of processors that can bring the best performance of the
% proposed load balancing algorithm. As the number of processing
% units increases after this number, the efficiency of proposed
% algorithm drops, which is understandable because the local load
% imbalance is so small overall that balancing gain won't
% offset the extra cost associated with repartitioning.
% But still the bottom line of proposed adaptive PDD
% algorithm is that it can run as fast as static one-step domain
% decomposition approach with less than 5\% overhead of
% repartitioning routine calls. On the other hand, if the number
% of processing units is fixed, bigger finite
% element model will exhibit better performance.
% The conclusion is shown clearly in 3D in Figure~\ref{fig:3dbar}.
% %\item
%
% % It is also worthwhile to point out that even without comparing
% % with classical DD, PDD itself exhibits deteriorating performance as
% % the number of processing units increases. Here the reproduction of
% % Figure~\ref{fig:summary} is presented with some downside performance
% % noted as shown in Figure~\ref{fig:summary_full}.
% % %\begin{landscape}
% % \begin{figure}[!htbp]
% % \begin{center}
% % \includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/PDD_AbsoluteSpeedup_full.pdf}
% % \caption{\label{fig:summary_full}
% % Full Range Scalability of PDD, Static
% % Loading, Shallow Foundation Model, ITR=1e-3, Imbal Tol 5\%, Performance
% % Downgrade Due to Increasing Network Overhead}
% % \end{center}
% % \end{figure}
% % %\end{landscape}
%
% The implication is explained as follows:
% \begin{itemize}
% \item The performance drop partly is due to the communication overhead
% gets bigger and bigger so parallel processing will not be able to offset
% the communication loss.
% \item It is also noted that as the number of processing units increases,
% the elemental level calculation drops very scalably with the number of
% CPUs. This is inherently advantage of the proposed PDD algorithm. PDD
% through domain decomposition is very scalable for local level
% calculations because inherently local comp is element-based. when
% elements are distributed, loads are spread out evenly (during initial and
% redistribution). So as the number of CPU increases, the equation solving
% becomes more expensive.
%
% For the case of 56,481 DOFs prototype model with DRM earthquake loading,
% it has been observed that for sequential case (1 CPU), elemental
% computation takes 70\% of time as shown in Table~\ref{tab:eptiming}. As for
% parallel case (8 CPUs), we optimized parallel elemental
% computations through PDD, elemental computation only accounts for about 40\%.
% As the number of CPU increases, parallel case (32 CPUs), the local level computation
% will only take less than 10\% of total wall clock time.
%
% In other words, as the number of CPUs increases, PDD loses scalability because of
% the equation solving now dominates. As being discussed in Chapter~\ref{ch:solver},
% the parallel direct solver itself is not scalable up to large
% number of CPUs (Demmel et al.~\cite{superlu99}). Parallel iterative solver is much more scalable
% but difficult to guarantee convergence. This is now also the most important topic
% in the whole scientific computing community.
% \end{itemize}
% %\end{itemize}
%
%
% %For one set of fixed algorithm parameters,
% %such as ITR and load imbalance tolerance, basic conclusion is
% %there exists an {\em optimal number} of processors that can
% %bring best performance and as finite element model size
% %increases, this number increases as listed in
% %Table~\ref{tab:optimal}.
% %\begin{table}[!htbp]
% %\begin{center}
% %\caption{\label{tab:optimal} Best Performance Observed for
% %ITR=0.001, Loab Imbalance Tolerance \%5}
% %\begin{scriptsizetabular}{|r|r|r|}\hline
% %{\bf \# of DOFs} & {\bf Speedup} & {\bf \# of CPUs}\\\hline
% %4,035 & 1.553 & 4 \\\hline
% %17,604 & 1.992 & 7 \\\hline
% %32,091 & 1.334 & 7 \\\hline
% %68,451 & 1.068 & 16 \\\hline
% %\end{scriptsizetabular}
% %\end{center}
% %\end{table}
%
% %The second point is related to the implementation of the
% %multilevel graph partitioning algorithm. In current
% %implementation of ParMETIS used in this work, vertex
% %weight can only be specified as an {\em int}. That means in
% %order to get timing data from local level calculation for each
% %element, {\em double} data returned by MPI timing routine has
% %to be converted to {\em int}. Significant digit loss can happen
% %depending on what accuracy the system clock can carry. We
% %can also adjust the vertex weight by amplifying the timing by
% %scale factors in order to save effective digits. 10 milisecond
% %has been used in this work to represent the effective
% %timing digits when converting from {\em double} to {\em int}.
% %
% %It is also noted that the current version of parallel OpenSees
% %can not handle huge model even with the most advanced super
% %computers national wide. One obvious drawback is that the
% %OpenSees needs to read all model data into one single master
% %process which orchestrates subsequent partitioning and
% %analysis. For huge model with millions of DOFs, we can never
% %expect one single process is capable of loading all model data
% %on its own. The restructuring will require parallel data
% %read-in and model setup which basically conflicts with the
% %foundation of current parallel design. This can be extended to
% %a future work topic.
% %
% %\item As we can also see from previous sections, the parallel
% %performance of PDD is not very scalable for a fixed set of
% %parameters and this can be explained in details.
% %\begin{enumerate}
% %\item The first point is
% %at the parallel algorithm level. The research finished so far
% %in the work mainly aims at performance gains by
% %adaptive graph partitioning/repartitioning over the
% %old design that only performs one-step static partitioning.
% %Aside from data communication overhead, analysis model
% %regeneration after adaptive repartitioning has been
% %overwhelming as the model size increases. This extra overhead
% %can not be accounted in element-based graph partitioning algorithms.
% %Parametric tuning on manually setting up vertices' weight can
% %improve performance.
% %\end{enumerate}
% %
% %out \appendix
% %out \include{b_eff}
% %out
% %out \bibliography{ref}
% %out \bibliographystyle{unsrt}
% %out
% %out \end{document}
%
%
%
%
%
%
%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{FINITE ELEMENT SFSI MODEL DEVELOPMENT}
\label{model}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The finite element models used in this study combine both
elastic--plastic solid
elements, used for soils, and elastic and elastic--plastic structural elements, used for concrete
piles, piers, beams and the superstructure. In this section, described are the
material and finite element models used for both soil and structural components.
In addition, described is the methodology used for seismic force
application and staged construction of the model.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Soil and Structural Model}
\label{soilmodel}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Soil Models.}
Two types of soil were used in modeling. The first type of soil was based on
stiff, sandy soil, with limited calibration data (Kurtulus et
al.~\citep{texas:site}) available from the Capitol
Aggregates site (south--east of Austin, Texas).
%
Based on the stress--strain curve obtained from a triaxial
test, a nonlinear elastic--plastic soil
model has been developed using the Template Elastic--Plastic framework
\citep{Jeremic2000f}.
%
The developed model consists of a Drucker--Prager yield surface,
Drucker--Prager plastic flow directions (potential surface) and a nonlinear Armstrong--Frederick
(rotational) kinematic hardening rule \citep{Armstrong66}.
%
The initial opening of the Drucker--Prager cone was set at only $5^{\circ}$,
while the actual deviatoric hardening is produced
using Armstrong--Frederick nonlinear kinematic hardening with hardening
constants $a=116.0$ and $b=80.0$.
The second type of soil used in modeling was soft clay (Bay Mud). This type of soil was
modeled using a total stress approach with an elastic perfectly plastic von
Mises yield surface and plastic potential function. The shear strength of the
Bay Mud material was set at $C_u = 5.0$~kPa. Since this soil is
treated as fully saturated and there is not enough time during shaking for any
dissipation to occur, the elastic--perfectly plastic model provides sufficient
modeling accuracy.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Soil Element Size Determination.}
The accuracy of a numerical simulation of seismic wave propagation
in a dynamic SFSI problem is mainly controlled by the spacing of the nodes of the
finite element model.
%
In order to represent a
traveling wave of a given frequency accurately, about 10 nodes
per wavelength are required.
%
In order to determine the appropriate maximum grid
spacing, the highest relevant frequency $f_{max}$ that is to be simulated in
the model needs to be decided upon.
%
Typically, for seismic analysis one can assume $f_{max} = 10$~Hz.
The wavelength $\lambda_{min} = v/f_{max}$,
where $v$ is the (shear) wave velocity, is chosen to be represented by 10 nodes;
the smallest wavelength that can still be captured with any confidence is
$\lambda = 2\Delta h$, corresponding to a frequency of $5f_{max}$.
%
The maximum grid spacing $\Delta h$ should therefore not be larger than
%
\begin{equation}
\label{eq:gridsize}
\Delta h \le \frac{\lambda_{min}}{10} = \frac{v}{10f_{max}}
\end{equation}
%
where $v$ is the smallest wave velocity that is of interest in the
simulation (usually the wave velocity of the softest soil layer).
%
In addition to that, mechanical properties of soil will change with (cyclic)
loadings as plastification develops.
%
The modulus reduction curve ($G/G_{max}$) can then be used to determine the soil element
size while taking into account soil stiffness degradation (plastification).
%
Using the relation between shear wave velocity and shear modulus
%
\begin{equation}
\label{eq:g_v}
v_{shear} = \sqrt{\frac{G}{\rho}}
\end{equation}
%
one can readily obtain the degradation of wave velocities with increasing shear
strain. This leads to the smaller element sizes required for detailed simulation of wave
propagation in soils that undergo stiffness degradation (plastification).
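As a minimal illustration (not part of the original analyses), the element size
bound of Eq.~(\ref{eq:gridsize}) combined with the velocity degradation implied by
Eq.~(\ref{eq:g_v}) can be sketched in a few lines; the velocity and frequency
values below are hypothetical:

```python
import math

def max_element_size(v_s, f_max, g_over_gmax=1.0, nodes_per_wavelength=10):
    """Upper bound on soil element size, Delta h <= v / (10 f_max).

    v_s          -- small-strain shear wave velocity [m/s]
    f_max        -- highest frequency to be resolved [Hz]
    g_over_gmax  -- modulus reduction factor G/G_max (<= 1.0);
                    since v = sqrt(G/rho), the degraded velocity
                    scales with sqrt(G/G_max)
    """
    v_degraded = v_s * math.sqrt(g_over_gmax)
    return v_degraded / (nodes_per_wavelength * f_max)

# Hypothetical soft layer, v_s = 100 m/s, f_max = 10 Hz:
print(max_element_size(100.0, 10.0))        # elastic: 1.0 m
print(max_element_size(100.0, 10.0, 0.08))  # degraded: about 0.28 m
```

Note how a stiffness reduction to $G/G_{max} = 0.08$ shrinks the admissible
element size by a factor of $\sqrt{0.08} \approx 0.28$, consistent with the
trend in Table~\ref{modeli}.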
%
% The addition of
% stiffness degradation effects (plastification) of soil on soil finite size are listed
% in Table~\ref{tab:new_size}.
% %
% \begin{table}
% \begin{center}
% \caption{\label{tab:new_size} Soil finite element size
% determination with shear wave velocity and stiffness degradation effects for
% assumed seismic wave with $f_{max}=$10~HZ, (minimal value of $G/G_{max}$
% corresponding to $0.2\%$ strain level.)}
% \begin{tabular}{|r|r|r|r|r|r|}\hline
% Depth ($ft$) & Layer thick. ($ft$) & $v_s$ ($fps$) &
% $G/G_{max}$ & $v_s^{min}$(fps) & $\Delta h_{max}$ ($ft$) \\\hline\hline
% 0 & 1 & 320 & 0.36 & 192 & 1.92 \\\hline
% 1 & 1.5 & 420 & 0.36 & 252 & 2.52 \\\hline
% 2.5 & 4.5 & 540 & 0.36 & 324 & 3.24 \\\hline
% 7 & 7 & 660 & 0.36 & 396 & 3.96 \\\hline
% 14 & 7.5 & 700 & 0.36 & 420 & 4.20 \\\hline
% 21.5 & 17 & 750 & 0.36 & 450 & 4.50 \\\hline
% 38.5 & half-space &2200 & 0.36 &1320 & 13.20\\\hline
% \end{tabular}
% \begin{minipage}[b]{\textwidth}
% \addtocounter{footnote}{-1}
% \end{minipage}
% \end{center}
% \end{table}
% %
Table \ref{modeli} presents an
overview of model size (number of elements and element size) as a function of
the cutoff frequency represented in the model, the material (soil) stiffness (given in
terms of $G/G_{max}$), and the amount of shear deformation for the given stiffness
reduction.
%
\begin{table}[!htbp]
%\caption{ }
\caption{\label{modeli} Variation in model size (number of elements
and element size) as a
function of frequency, stiffness, and shear deformation.}
\begin{center}
\begin{tabular}{|c|c|c|c|r|}
\hline
model size (\# of elements) & element size & $f_{cutoff}$ & min. $G/G_{max}$
& shear def. $\gamma$ \\
\hline
%
%
% 100cm element model, f_cuttof=10HZ, G/Gmax=1.0, 12K elements, Gmax model
12K & 1.0~m & 10~Hz & 1.0 & $<$0.5~\% \\
%
%
%~90cm element model, f_cutoff ~= 3Hz, G/Gmax~=0.08, for epsilon ~= 1%, 15K
%elements, now running...
15K & 0.9~m & 3~Hz & 0.08 & 1.0~\% \\
%
%
%27.6cm element model, f_cutoff= 10HZ, G/Gmax=0.08, for epsilon = 1%, 150K
%elements, finest model
150K & 0.3~m & 10~Hz & 0.08 & 1.0~\% \\
500K & 0.15~m & 10~Hz & 0.02 & 5.0~\% \\
% 27.6cm element model, f_cutoff= 10HZ, G/Gmax=0.08, for epsilon = 1%, 150K
%elements, finest model
\hline
\end{tabular}
\end{center}
\end{table}
%
It is important to note that full 3D nonlinear elastic--plastic finite element
simulations were performed, while the stiffness reduction curves were used only for
calibration of the material model and for determining the (minimal) finite element
size.
%
It should also be noted that the largest FEM model had over 0.5 million elements
and over 1.6 million DOFs. However, most of the simulations were performed with a
smaller model (150K elements), as it represented the mechanics of the problem
with an appropriate level of accuracy.
%
The working FEM model mesh is shown in Figure \ref{BridgeSFSI01}.
%
\begin{figure}[!htbp]
\begin{center}
%\includegraphics[width=8cm]{/home/jeremic/tex/works/Reports/2006/NEESDemoProject/PrototypeMesh.jpg}
\includegraphics[width=8cm]{PrototypeMesh.jpg}
%\vspace*{-0.5cm}
\caption{\label{BridgeSFSI01}
Detailed three bent prototype SFSI finite element model: 484,104 DOFs and 151,264
elements, used for most simulations in this study.}
\end{center}
\end{figure}
%
%
%
%
%
%
%
%
%
%
%
% Based on above soil finite element size determination, a three bent prototype
% finite element model has been developed and is shown in Figure~\ref{fig:prototype}.
% %
% \begin{figure}[!htbp]
% \begin{center}
% %\includegraphics[width=1.0\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/PrototypeMesh.eps}
% \includegraphics[width=0.65\textwidth]{/home/jeremic/tex/works/Papers/2007/SFSI_in_nonuniform_soils/figs/PrototypeMesh.pdf}
% \caption{\label{fig:prototype}
% Detailed Three Bent Prototype SFSI Finite Element Model, 484,104 DOFs, 151,264
% Elements.}
% \end{center}
% \end{figure}
% %
The model used features 484,104 DOFs and 151,264 soil and beam--column elements,
and is intended to appropriately model seismic waves of up to 10~Hz, for a minimal
stiffness degradation of $G/G_{max} = 0.08$, a maximum shear strain of $\gamma =
1$\%, and a maximal element size of $\Delta h = 0.3$~m.
%
The largest model (1.6 million DOFs) was able to capture
10~Hz motions, for $G/G_{max} = 0.02$, and maximum shear strain of $\gamma =
5$\% (see Table~\ref{modeli}).
%
%
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \paragraph{Time Step Length Requirement}
%
%
% The time step $\Delta t$
% used for numerically solving nonlinear
% vibration or wave propagation problems has to be limited for
% two reasons \citep{local-87}. The stability requirement depends on the time
% integration scheme in use and it restricts the size of $\Delta t=T_n/10$.
% Here, $T_n$ denotes the smallest fundamental period of the
% system. Similar to the spatial discretization, $T_n$ needs to be
% represented with about 10 time steps. While the accuracy
% requirement provides a measure on which higher modes of
% vibration are represented with sufficient accuracy, the
% stability criterion needs to be satisfied for all modes. If the
% stability criterion is not satisfied for all modes of vibration,
% then the solution may diverge. In many cases it is necessary to
% provide an upper bound to the frequencies that are present in a
% system by including frequency dependent damping to the time integration scheme.
%
%
% The second stability criterion results from the nature of
% finite element method. As a wave front progresses in space, it
% reaches one point (node) after the other. If the time step in the
% finite element analysis is too large, than the wave front can reach
% two consecutive points (nodes) at the same time. This would violate
% a fundamental property of wave propagation and can lead to
% instability. The time step therefore needs to be limited to
% %
% \begin{equation}
% \Delta t < \frac{\Delta h}{v}
% \end{equation}
% %
% where $v$ is the highest wave velocity.
% %
% Based on values determined in Table~\ref{tab:new_size}, the time step requirement
% can be written as
% %
% \begin{equation}
% \Delta t < \frac{\Delta h}{v} = 0.00256 s
% \end{equation}
% %
% thus limiting the effective time step size used in numerical simulations of this
% particular soil--structure model.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \subsection{Structural Model}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Structural Models.}
The nonlinear structural model (the piers and the superstructure) used in this study was
initially developed by Fenves and Dryden~\cite{Dryden2005}.
%
% Model calibration was performed using experimental data from UNR shaking table tests
% \citep{Dryden2006}.
% %
% The original model was developed with fully fixed conditions at the base of
% piers. This choice of boundary conditions influences location of possible
% plastic hinges in piers.
% %
% This is important as the model predetermines location of plastic hinges by
% placing zero--length hinges at the bottom and top of the piers.
% %
% In reality, the piers extend into piles and the possible
% plastic hinges might form below ground surface (as well as at the top of piers)
% and not at the bottom of piers.
%
This original structure--only model was subsequently updated to include
piles and the surrounding soil, and the zero-length elements (modeling concentrated
plastic hinges) were removed from the
bottoms of the piers at the connections to the piles.
% %
% The zero--length elements were removed from the bottom of the pier.
% %
% In addition to that, the beam elements used for piles were modeled using nonlinear fiber beam
% element which allows for development of (distributed) plastic hinges.
% %
%
%
%
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \subsection{Material Modeling}
%
%
%\paragraph{Concrete Modeling.}
The concrete material was modeled using the Concrete01 uniaxial
material available in the OpenSees framework (Fenves and
Dryden~\cite{Dryden2005, Dryden2006}).
% Basic description is provided here for
% completeness.
%
% This model model is based on work by
% \citet{kent1971} and includes linear unloading/reloading stiffness that degrades
% with increasing strain.
% %
% The confined concrete compressive strength, $f_{cc}^{\prime}$, has been calculated
% based on work by \citet{Mander1988}. The ultimate strain for
% the confined concrete was determined using the energy balance approach developed
% by \citet{Mander1988}.
%
Material model parameters used for the unconfined concrete
in the simulation models were $f_{co}^{\prime}=5.9$~ksi,
$\epsilon_{co}=0.002$, $f_{cu}^{\prime}=0.0$~ksi, and $\epsilon_{cu}=0.006$.
Material parameters used for the confined concrete were
$f_{co}^{\prime}=7.5$~ksi, $\epsilon_{co}=0.0048$, $f_{cu}^{\prime}=4.8$~ksi, and
$\epsilon_{cu}=0.022$.
%
% % The response of these models under cyclic loading is
% % shown in Figure~\ref{fig:F4.1}.
% %
% % %\begin{table}[!htbp]
% % %\begin{center}
% % %\caption{\label{T4.1} Concrete material properties \cite{Dryden2006}}
% % %\begin{tabular}{|c|c|c|c|c|}\hline
% % %Concrete Type & $f_{co}^{\prime}$ ($ksi$) & $\epsilon_{co}$ & $f_{cu}^{\prime}$($ksi$) & $\epsilon_{cu}$ \\\hline
% % %Unconfined & $5.9$ & 0.002 & 0.0 & 0.006 \\\hline
% % %Confined & 7.5 & 0.0048 & 4.8 & 0.022 \\\hline
% % %\end{tabular}
% % %\end{center}
% % %\end{table}
% % \begin{figure}[!htbp]
% % \begin{center}
% % \includegraphics[width=0.6\textwidth]{figs/f4-1.eps}
% % \caption{\label{fig:F4.1} Cyclic response of concrete material model \citep{Dryden2006}}
% % \end{center}
% % \end{figure}
% %
% %
%
% \paragraph{Steel Modeling.}
The Hysteretic uniaxial material model available within the OpenSees framework was
selected to model the response of the steel reinforcement.
The parameters used in this
model are $F_1=67$~ksi, $\epsilon_1=0.0023$, $F_2=92$~ksi, $\epsilon_2=0.028$,
$F_3=97$~ksi, and $\epsilon_3=0.12$. No allowance for pinching or damage under
cyclic loading has been made ($pinchX=pinchY=1.0$, $damage1=damage2=0.0$,
$beta=0$).
The finite element model for the piers and piles features nonlinear fiber
beam--column elements
(Spacone et al.~\cite{Spacone1996}). In addition, zero-length elements are introduced
at the tops of the piers in order to capture the effect of rigid body
rotation at the joints due to elongation of the anchored reinforcement.
%
The cross sections of both piers and piles were discretized
using $4 \times 16$ subdivisions of the core
and $2 \times 16$ subdivisions of the cover in the radial and tangential directions,
respectively.
%
The additional deformation that can develop at the upper end of a pier results from
elongation of the steel
reinforcement at the beam--column joint with the superstructure and is modeled
using a simplified hinge model (Mazzoni~\cite{Mazzoni2004}).
% As discussed by \citet{Mazzoni2004}, bar slip occurs
% in two modes: elongation due to the variation in strain along the length of the
% anchored bar resulting from bond to the surrounding concrete, and rigid body
% slip of the bar that is resisted by friction from the surrounding concrete.
% %
% A bi-uniform bond stress distribution was
% assumed along the length of the anchored bar based on the simplified model
% developed by \citet{Lehman1998}.
Parameters used for the bi-uniform steel--concrete bond stress
distribution were $u_e=12\sqrt{f_c^{\prime}}$ for the elastic portion of the
anchored bar and $u_y=6\sqrt{f_c^{\prime}}$ for the yielded portion (Lehman
and Moehle~\cite{Lehman1998}).
The bent cap beams were modeled as linear elastic beam--column elements with
geometric properties developed using the effective width of the
cap beam and a reduction of its stiffness due to cracking.
%
The superstructure consists of prismatic prestressed concrete members and
was also modeled using linear elastic beam--column elements.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Coupling of Structural and Soil Models.}
\label{couplingSection}
In order to create a model of the complete soil--structure system, it was
necessary to couple the structural and soil (solid) finite elements.
%
Figure \ref{coupling} shows a schematic of the coupling between structural
(pile) and solid (soil) finite elements.
%
%
%
\begin{figure}[!htbp]
\begin{center}
{\includegraphics[width=8cm]{/home/jeremic/tex/works/Conferences/2005/OpenSeesWorkshopAugust/UsersExample/SFSIModelSetup02.pdf}}
\caption{\label{coupling} Schematic description of coupling of structural elements (piles)
with solid elements (soil).}
\end{center}
\end{figure}
%
The volume that would be physically occupied by a pile is left open within the
solid mesh that models the foundation soil. This opening (hole) is excavated
during a staged construction process (described later).
Beam--column elements (representing the pile) are then placed in the middle of this
opening and are connected to the
surrounding solid (soil) elements by means of short, stiff, elastic beam--column
elements. These short ``connection'' beam--column elements extend from each pile
beam--column node to the surrounding nodes of the solid (soil) elements. The
connection beam--column element nodes are tied to the soil (solid) nodes
only through the translational degrees of freedom (three
per node), while the three rotational degrees of freedom of the
beam--column element are left unconnected.
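The connectivity rule described above (translations tied, rotations left free)
can be sketched as simple bookkeeping; the node tags and the helper function
below are hypothetical, for illustration only, and do not reproduce any
particular finite element code's API:

```python
def connection_elements(pile_node, soil_ring_nodes):
    """Enumerate the short, stiff 'connection' beam--column elements
    that tie one pile beam--column node to the surrounding ring of
    soil (solid) nodes.

    Only the three translational DOFs (1-3) are shared with the soil
    nodes; the three rotational DOFs of the beam node stay
    unconnected, mirroring the coupling scheme described above.
    Returns a list of (pile_node, soil_node, shared_dofs) tuples.
    """
    shared_dofs = (1, 2, 3)  # translations only; rotations stay free
    return [(pile_node, soil, shared_dofs) for soil in soil_ring_nodes]

# Hypothetical tags: pile node 501 surrounded by eight soil nodes.
elems = connection_elements(501, [1001, 1002, 1003, 1004,
                                  1005, 1006, 1007, 1008])
print(len(elems))  # 8 -- one connection element per surrounding soil node
```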
% Connecting piles to soil using above described method has a number of advantages
% and disadvantages. On the positive side, the geometry of the soil--pile system
% is modeled very accurately. A thin layer of elements next to pile is used to
% mimic frictional behavior soil--pile interface. In addition to that, the
% deformation modes of a pile (axial, bending, shearing) are accurately transferred
% to surrounding soil by means of connection beam--column elements. In addition to
% that, both the pile and the soil are modeled using best available finite
% elements (nonlinear beam--column for pile and elastic--plastic solids for soil).
% On the negative side, the discrepancy of displacement approximation fields
% between pile ( a nonlinear beam--column) and soil (a linear solid brick
% elements) will lead to incompatibility of displacements between nodes of
% pile--soil system.
% However, this incompatibility was deemed acceptable in the view of advantages
% described above.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Application of Earthquake Motions}
Seismic ground motions were applied to the SSI finite element model using the
Domain Reduction Method (DRM; Bielak et al.~\cite{Bielak2001} and
Yoshimura et al.~\cite{Yoshimura2001}).
%
The DRM provides a consistent means of
applying ground motions to the finite element model.
%
The method
features a two-stage strategy for complex, realistic three dimensional
earthquake engineering simulations. The first stage is an auxiliary
problem that simulates the earthquake source and propagation
path effects with a model that encompasses the source and the
free field (from which the soil--structure system has been
removed). The second stage models the local soil--structure effects. Its
input is a set of effective forces that are derived from the
first stage. These forces act only within a single layer of
elements adjacent to the interface between the exterior region
and the geological feature of interest.
%
While the DRM allows for the application of arbitrary 3D wave fields to the finite
element model, in this study a vertically propagating wave field was used.
Starting from given free field ground motions at the surface, these motions
were deconvolved to a depth of 100~m.
A vertically propagating wave field was then (re)created and used to compute
the effective forces for the DRM. The deconvolution and
(back) propagation of the vertically propagating wave field were performed using
the closed form, 1D solution implemented in the SHAKE
program (Idriss and Sun~\cite{SHAKE91}).
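The deconvolution idea can be sketched minimally for a single uniform,
undamped elastic soil column (the actual SHAKE analysis handles layering and
material damping); the function name and numerical values below are
illustrative assumptions, not the program's implementation:

```python
import numpy as np

def deconvolve_uniform(surface_motion, dt, depth, v_s):
    """Transfer a free-surface motion down to `depth` in a single
    uniform, undamped elastic soil column (vertically propagating
    shear wave).  For such a column the motion at depth z satisfies
    u(z) = u(0) * cos(omega z / v_s), applied here per frequency bin.
    """
    n = len(surface_motion)
    freqs = np.fft.rfftfreq(n, d=dt)       # frequency of each bin [Hz]
    k = 2.0 * np.pi * freqs / v_s          # wavenumber omega / v_s
    U = np.fft.rfft(surface_motion)        # surface motion spectrum
    return np.fft.irfft(U * np.cos(k * depth), n=n)

# Illustrative values: 2 Hz sine at the surface, v_s = 200 m/s,
# carried down to 12.5 m (omega z / v_s = pi/4, amplitude scaled by 0.707).
t = np.arange(200) * 0.01
at_depth = deconvolve_uniform(np.sin(2 * np.pi * 2.0 * t), 0.01, 12.5, 200.0)
```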
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Staged Simulations}
\label{staged}
Application of loads in stages is essential for elastic--plastic models. This is
especially true for models of soil and concrete.
%
Staged loading ensures appropriate initial conditions for each loading stage.
%
Modeling starts from a zero stress and deformation state. Three loading stages,
described below, then follow.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Soil Self--Weight Stage.}
%
During this stage the finite element model of the soil (only, with no structure)
is loaded with the soil
self--weight. The finite element model for this stage excludes any structural
elements; the openings (holes) where the piles will be placed are still filled with soil.
Displacement boundary conditions on the sides of the three soil blocks
allow vertical movement and horizontal movement within the boundary plane,
while prohibiting movement of the soil out of the boundary plane. All
displacements are suppressed at the bottoms of all three soil blocks. The
soil self--weight is applied in 10 incremental steps.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Piles, Columns and Superstructure Self--Weight Stage.}
%
In this second stage, a number of changes to the model take place.
First, the soil elements where the piles will be placed are removed (excavated);
then the concrete piles (beam--column elements) are placed in the holes (while
appropriately connecting the structural and solid degrees of freedom, as
described above);
the columns are placed on top of the piles; and finally the
superstructure is placed on top of the columns. All of this construction is done at
once. With all the components in place, the self--weight analysis of the
piles--columns--superstructure system is performed.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Seismic Shaking Stage.}
The last stage in our analysis consists of applying the seismic shaking, by means
of the effective forces derived using the DRM. It is important to note that the seismic
shaking is applied to the already deformed model, with all the stresses,
internal variables, and deformations that resulted from the first two stages of
loading.
%
%set NumSteps 3000
%set Dt 0.02
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{SIMULATION RESULTS}
\label{results}
The bridge model described above was used to analyze a number of cases with
different foundation soils and earthquake excitations.
%
Two sets of ground motions were used for the same bridge structure,
together with two variations of foundation soil, namely (a) all stiff sand and (b) all
soft clay. One of the main goals was to investigate whether free field
motions can be used directly as input to the structural model (as is almost
exclusively done nowadays), that is, to investigate how significant the SFSI
effects are. In addition, the differences in structural
response that result from varying soil conditions were investigated.
%
%
Ground motions for the Northridge and Kocaeli earthquakes (free field
measurements, see Figure \ref{Motions}) were used in determining the appropriate wave
fields (using the DRM).
%
Since the main aim of the exercise was to investigate the SFS system, a set of
short period motions was chosen from among the Northridge records, while long
period motions from the Kocaeli earthquake were used for the long period example.
%
%
\begin{figure}[!htbp]
\begin{center}
%\includegraphics[width=7.5truecm]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/version_OKT/Images/InputMotion_Northridge.pdf}
\includegraphics[width=13truecm]{InputMotion_Northridge.pdf}
\\
%\hfill
%\includegraphics[width=8.0truecm]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/version_OKT/Images/LongMotion/InputMotion_Kocaeli.pdf}
\includegraphics[width=13truecm]{InputMotion_Kocaeli.pdf}
%\vspace*{-0.5cm}
\caption{\label{Motions} Input motions: short period (Northridge) and long period (Kocaeli).}
\end{center}
\end{figure}
%
A number of very interesting results were obtained and are discussed
below.
%while observing difference between free field motions and actual
%(simulated) motions observed at the bottom at of piers, and the bending moment
%response.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Free Field vs. SSI Motions.}
A very important aspect of SFSI is the difference between the free field
motions and the motions that are changed (affected) by the presence of the
structure. Figure \ref{FreeField_vs_real_shortperiod} shows a comparison of the free
field short period motions (obtained by vertical propagation of the earthquake
motions through the model without the presence of the bridge structure and piles)
and those recorded at the base of the column of the left bent in stiff and soft
soils.
%
\begin{figure}[!htbp]
\begin{center}
%\includegraphics[width=10cm]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/version_OKT/Images/DispSoilBlock1_15cm_25s_SC.pdf}
\includegraphics[width=12cm]{DispSoilBlock1_15cm_25s_SC.pdf}
%\vspace*{-0.5cm}
\caption{\label{FreeField_vs_real_shortperiod}
Comparison of free field versus measured (prototype model) motions at the base of
left bent for the short period motions (Northridge) for all clay (CCC) and all sand (SSS) soils.}
\end{center}
\end{figure}
%
It is immediately obvious that the free field motions in this case do not
correspond to the motions observed in the bridge SFS system with either stiff
or soft soil. In fact, both the amplitude and the period are significantly altered
for both the soft and the stiff soil in the presence of the bridge structure.
%
This quite different behavior can be explained by taking into account the
fact that the short period motions excite the natural periods of the stiff soil and
can produce (significant) amplification. In addition, for the soft soil,
significant elongation of the period is observed.
On the other hand, as shown in Figure \ref{FreeField_vs_real_longperiod}, the
same SFS system (same structure with stiff or soft soil beneath) responds quite
differently to long period motions.
%
\begin{figure}[!htbp]
\begin{center}
%\includegraphics[width=10cm]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/version_OKT/Images/LongMotion/DispSoilBlock1.pdf}
\includegraphics[width=12cm]{DispSoilBlock1.pdf}
%\vspace*{-0.5cm}
\caption{\label{FreeField_vs_real_longperiod}
Comparison of free field versus measured (prototype model) motions at the base
of left bent for the long period motions (Kocaeli) for all clay (CCC) and all
sand (SSS) soils.}
\end{center}
\end{figure}
%
The difference between the free field
motions and the motions measured (simulated) in stiff soil is smaller in this
case. This
is understandable, as the stiff soil is virtually carried along as an (almost) rigid
block by such long period motions. For the soft soil, one of the predominant
natural periods of the SFS system is excited briefly (at 12--17 seconds), but other
than that excursion, both the stiff and the soft soil show fair agreement with the
free field motions. In this case the SFS effects are not that
pronounced, except during the above mentioned period between 12 and 17 seconds.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Bending Moments Response.}
The influence of variable soil conditions and of the dynamic characteristics of
the earthquake motions on the structural response is followed by observing the
bending moment
response. For this particular purpose, the time history of the bending moment at the
top of one of the piers of bent \#~1 (leftmost in Figure
\ref{BridgeSFSI01}) is chosen to illustrate the differences in
behavior.
Figure~\ref{fig:m_scB1} shows the time history of the bending moment at the top of the
leftmost pier of bent \#~1 for the all sand (SSS) and all clay (CCC) cases for the short
period motion (Northridge).
%
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/MomentBent1Pile1_25s_SC.pdf}
\caption{\label{fig:m_scB1}
Simulated bending moment time series (top of left pier) for short period motion
(Northridge), for all
clay (CCC) and all sand (SSS) soils.}
\end{center}
\end{figure}
Similarly, Figure~\ref{fig:accel_B11_long} shows the time history of the bending
moment for the same pier and the same soil conditions, but for the long
period motion (Kocaeli).
%
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.8\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/LongMotion/MomentBent1Pile1.pdf}
%\includegraphics[width=0.7\textwidth]{/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/Images/LongMotion/MomentBent1Pile2.pdf}
\caption{\label{fig:accel_B11_long}
Simulated bending moment time series (top of left pier) for long period motion (Kocaeli), for all
clay (CCC) and all sand (SSS) soils.}
\end{center}
\end{figure}
%
The time histories of bending moments are quite different for the two soil
conditions (SSS and CCC) and for the two earthquake motions. For example, it
can be seen from Figure~\ref{fig:m_scB1} that the short period earthquake, in
stiff soil (SSS), produces (much) larger plastic deformation, which can be
observed by noting the flat
plateaus in the moment--time diagrams, representing plastic hinge
development. These plastic hinge regions develop
symmetrically, meaning that both sides of the pier have yielded and a full plastic
hinge has formed.
%
On the other hand, the short period earthquake in soft soil (CCC)
produces very little damage; one side of a plastic hinge might be forming
between 14 and 15 seconds.
Contrasting those observations is the time history of bending moments in
Figure~\ref{fig:accel_B11_long}, where, for the long period motion, the stiff soil
(SSS)
induces only a small amount of plastic yielding (hinging) at the tops of the piers,
whereas the soft
soil (CCC) induces (very) large plastic deformations. The development of plastic
hinges for the structure in soft soil also lasts very long (more than two seconds;
see the lower plateau for the CCC case in Figure~\ref{fig:accel_B11_long}), resulting in
significant damage development in the plastic hinges thus formed.
%
The observed behavior also somewhat contradicts the common assumption that soft soils
are always more detrimental to structural behavior. It is actually the interaction
of the dynamic characteristics of the earthquake, soil, and structure (ESS) that
seems to control the ultimate structural response and the potential damage that
might develop.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{SUMMARY}
In this paper, an algorithm named the Plastic Domain Decomposition (PDD),
for parallel elastic--plastic finite element computations, was presented.
Also presented was a parallel scalability study, which shows that PDD scales
quite well with an increasing number of compute nodes. More importantly, the
presented details of PDD reveal that scalability is assured for inhomogeneous,
multiple generation parallel computer architectures, which represent the majority
of currently available parallel computers.
Also presented was an application of PDD
to a soil--foundation--structure interaction problem for a bridge system, in which
Earthquake--Soil--Structure (ESS) interaction effects were emphasized. The
importance of the (mis--) matching of the ESS characteristics to the dynamic behavior of
the bridge soil--structure system was shown on an example using the same structure, two
different earthquakes (one with short and one with long predominant periods), and
two different subgrade soils (stiff sand and soft clay).
The main goal of this paper is to show that high fidelity,
detailed modeling and simulation of geotechnical and structural systems is
available and quite effective. Results from such high fidelity modeling and
simulation shed light on new types of behavior that cannot be discovered using
simplified models. These high fidelity models tend to reduce modeling
uncertainty, which (might) allow practicing engineers to use simulation tools
for the effective design of soil--structure systems.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{ACKNOWLEDGMENT}
%
The work presented in this paper was supported in part by a grant from the
Civil and Mechanical Systems program, Directorate for Engineering of the National
Science Foundation, under Award NSF--CMS--0337811 (cognizant program director
Dr. Steve McCabe).
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\renewcommand{\baselinestretch}{0.8}
\small % trick from Kopka and Daly p47.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\bibliography{../refmatthias}
%\bibliography{refmech,refcomp,/home/jeremic/tex/works/Thesis/MatthiasPreisig/thesis/refmatthias}
%\bibliography{refmech,refcomp,auxSFSI,/home/jeremic/tex/works/Thesis/MatthiasPreisig/thesis/refmatthias,/home/jeremic/tex/works/Thesis/GuanzhouJie/thesis/Verzija_Februar/ref}
\bibliography{refmech,refcomp,auxSFSI,refGuanzhou,refMatthias,auxCompPlatform.bib,Nima.bib}
%\bibliographystyle{unsrtnat}
%\bibliographystyle{plainnat}
\bibliographystyle{plain}
%++++
\end{document}
\bye