martes, 16 de noviembre de 2010

Fast Multiplication of Matrices with Decay. (arXiv:1011.3534v1 [cs.DS])

Fast Multiplication of Matrices with Decay. (arXiv:1011.3534v1 [cs.DS]): "

A fast algorithm for the approximate multiplication of matrices with decay is
introduced; the Sparse Approximate Matrix Multiply (SpAMM) reduces complexity
in the product space, a different approach from current methods that economize
within the matrix space through truncation or rank reduction. Matrix truncation
(element dropping) is compared to SpAMM for quantum chemical matrices with
approximate exponential and algebraic decay. For matched errors in the
electronic total energy, SpAMM is found to require fewer to far fewer floating
point operations relative to dropping. The challenges and opportunities
afforded by this new approach are discussed, including the potential for high
performance implementations.

"

An efficient method for the incompressible Navier-Stokes equations on irregular domains with no-slip boundary conditions, high order up to the boundary. (arXiv:1011.3589v1 [math.NA])

An efficient method for the incompressible Navier-Stokes equations on irregular domains with no-slip boundary conditions, high order up to the boundary. (arXiv:1011.3589v1 [math.NA]): "

Common efficient schemes for the incompressible Navier-Stokes equations, such
as projection or fractional step methods, have limited temporal accuracy as a
result of matrix splitting errors, or introduce errors near the domain
boundaries (which destroy uniform convergence to the solution). In this paper
we recast the incompressible (constant density) Navier-Stokes equations (with
the velocity prescribed at the boundary) as an equivalent system, for the
primary variables velocity and pressure. We do this in the usual way away from
the boundaries, by replacing the incompressibility condition on the velocity by
a Poisson equation for the pressure. The key difference from the usual
approaches occurs at the boundaries, where we use boundary conditions that
unequivocally allow the pressure to be recovered from knowledge of the velocity
at any fixed time. This avoids the common difficulty of an, apparently,
over-determined Poisson problem. Since in this alternative formulation the
pressure can be accurately and efficiently recovered from the velocity, the
recast equations are ideal for numerical marching methods. The new system can
be discretized using a variety of methods, in principle to any desired order of
accuracy. In this work we illustrate the approach with a 2-D second order
finite difference scheme on a Cartesian grid, and devise an algorithm to solve
the equations on domains with curved (non-conforming) boundaries, including a
case with a non-trivial topology (a circular obstruction inside the domain).
This algorithm achieves second order accuracy (in L-infinity), for both the
velocity and the pressure. The scheme has a natural extension to 3-D.

"

Domain Decomposition method on GPU cluster. (arXiv:1011.3318v1 [hep-lat])

Domain Decomposition method on GPU cluster. (arXiv:1011.3318v1 [hep-lat]): "

Pallalel GPGPU computing for lattice QCD simulations has a bottleneck on the
GPU to GPU data communication due to the lack of the direct data exchanging
facility. In this work we investigate the performance of quark solver using the
restricted additive Schwarz (RAS) preconditioner on a low cost GPU cluster. We
expect that the RAS preconditioner with appropriate domaindecomposition and
task distribution reduces the communication bottleneck. The GPU cluster we
constructed is composed of four PC boxes, two GPU cards are attached to each
box, and we have eight GPU cards in total. The compute nodes are connected with
rather slow but low cost Gigabit-Ethernet. We include the RAS preconditioner in
the single-precision part of the mixedprecision nested-BiCGStab algorithm and
the single-precision task is distributed to the multiple GPUs. The benchmarking
is done with the O(a)-improved Wilson quark on a randomly generated gauge
configuration with the size of $32^4$. We observe a factor two improvment on
the solver performance with the RAS precoditioner compared to that without the
preconditioner and find that the improvment mainly comes from the reduction of
the communication bottleneck as we expected.

"

Minimizing Communication for Eigenproblems and the Singular Value Decomposition. (arXiv:1011.3077v1 [math.NA])

Minimizing Communication for Eigenproblems and the Singular Value Decomposition. (arXiv:1011.3077v1 [math.NA]): "

Algorithms have two costs: arithmetic and communication. The latter
represents the cost of moving data, either between levels of a memory
hierarchy, or between processors over a network. Communication often dominates
arithmetic and represents a rapidly increasing proportion of the total cost, so
we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds
were presented on the amount of communication required for essentially all
$O(n^3)$-like algorithms for linear algebra, including eigenvalue problems and
the SVD. Conventional algorithms, including those currently implemented in
(Sca)LAPACK, perform asymptotically more communication than these lower bounds
require. In this paper we present parallel and sequential eigenvalue algorithms
(for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms
that do attain these lower bounds, and analyze their convergence and
communication costs.

"

A probabilistic algorithm approximating solutions of a singular PDE of porous media type. (arXiv:1011.3107v1 [math.PR])

A probabilistic algorithm approximating solutions of a singular PDE of porous media type. (arXiv:1011.3107v1 [math.PR]): "

The object of this paper is a one-dimensional generalized porous media
equation (PDE) with possibly discontinuous coefficient $\beta$, which is
well-posed as an evolution problem in $L^1(\mathbb{R})$. In some recent papers
of Blanchard et alia and Barbu et alia, the solution was represented by the
solution of a non-linear stochastic differential equation in law if the initial
condition is a bounded integrable function. We first extend this result, at
least when $\beta$ is continuous and the initial condition is only integrable
with some supplementary technical assumption. The main purpose of the article
consists in introducing and implementing a stochastic particle algorithm to
approach the solution to (PDE) which also fits in the case when $\beta$ is
possibly irregular, to predict some long-time behavior of the solution and in
comparing with some recent numerical deterministic techniques.

"