Primary interface¶
-
class
drudge.
Drudge
(ctx: pyspark.context.SparkContext, num_partitions=True)[source]¶ The main drudge class.
A drudge is a robot that can help you with the menial tasks of symbolic manipulation for tensorial and noncommutative algebras. Because tensor and noncommutative algebraic problems are diverse and non-uniform, setting up a drudge requires domain-specific information about the problem. This is the base class, where the basic operations are defined. Different problems can subclass it with customized behaviour. Most importantly, the method
normal_order()
should be overridden to give the commutation rules for the algebraic system studied.
-
__init__
(ctx: pyspark.context.SparkContext, num_partitions=True)[source]¶ Initialize the drudge.
Parameters: - ctx – The Spark context to be used.
- num_partitions – The preferred number of partitions. By default, it is the default parallelism of the given Spark environment. An explicit integral value can also be given. It can be set to None, which disables all explicit load-balancing by shuffling.
-
ctx
¶ The Spark context of the drudge.
-
num_partitions
¶ The preferred number of partitions for data.
-
full_simplify
¶ If full simplification is to be performed on amplitudes.
It can be used to disable full simplification of the amplitude expressions by SymPy. For simple polynomial amplitudes, it is generally safe to disable full simplification.
-
simple_merge
¶ If only simple merge is to be carried out.
When it is set to true, only terms with the same factors involving dummies are going to be merged. This might be helpful for cases where the amplitudes are all simple polynomials of tensorial quantities. Note that this could disable some SymPy simplification.
Warning
This option might not give much more than disabling full simplification, yet it takes away many simplifications. In general, its use is not recommended.
-
set_name
(obj: typing.Any, label: typing.Any = None)[source]¶ Set the object into the name archive of the drudge.
When a label is given, its str form is used as the name of the object. Otherwise, the str form of the object itself is used.
-
names
¶ The name archive for the drudge.
The name archive object can be used for convenient accessing of objects related to the problem.
-
inject_names
(prefix='', suffix='')[source]¶ Inject the names in the name archive into the current global scope.
This function is for the convenience of users, especially interactive users. It is not itself used in official drudge code, except in its own tests.
Note that this function injects the names in the name archive into the global scope of the caller, rather than the local scope, even when called inside a function.
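The caller-global behaviour described above can be illustrated with a plain-Python sketch. This is not drudge's implementation; the function name `inject_into_caller_globals` and the use of `inspect.currentframe()` are assumptions made only to show what "the global scope of the caller" means when the call happens inside a function.

```python
import inspect


def inject_into_caller_globals(names, prefix='', suffix=''):
    """Hypothetical sketch: put names into the caller's *global* scope."""
    # f_back is the frame of whoever called this function; f_globals is
    # that frame's module-level namespace, not its local variables.
    caller_globals = inspect.currentframe().f_back.f_globals
    for name, obj in names.items():
        caller_globals[prefix + name + suffix] = obj


def demo():
    # Called inside a function, yet the name lands in the global scope.
    inject_into_caller_globals({'p': 1}, suffix='_')
    return 'p_' in globals(), 'p_' in locals()


in_globals, in_locals = demo()
```

Here `in_globals` is true and `in_locals` is false, which is the behaviour the note describes.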
-
set_dumms
(range_: drudge.term.Range, dumms, set_range_name=True, dumms_suffix='_dumms', set_dumm_names=True)[source]¶ Set the dummies for a range.
Note that this function overwrites the existing dummies if the range has already been given.
-
dumms
¶ The broadcast form of the dummies dictionary.
-
set_symm
(base, *symms, set_base_name=True)[source]¶ Set the symmetry for a given base.
Permutation objects in the arguments are interpreted as single generators. Other values will be iterated over to get their entries, which should all be permutations.
-
symms
¶ The broadcast form of the symmetries.
-
add_resolver
(resolver)[source]¶ Append a resolver to the list of resolvers.
The given resolver can be either a mapping from SymPy expressions, including atomic symbols, to the corresponding ranges, or a callable to be called with SymPy expressions. A callable resolver can return None to signal that it is unable to resolve the expression. The resolution is then dispatched to the next resolver.
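The dispatch rule can be sketched in plain Python as follows. This is only an illustration under stated assumptions: strings stand in for SymPy expressions and ranges, the function name `resolve_range` is hypothetical, and treating a missing mapping key the same way as a callable returning None is an assumption rather than documented behaviour.

```python
def resolve_range(expr, resolvers):
    """Hypothetical sketch: try each resolver in order until one succeeds."""
    for resolver in resolvers:
        if callable(resolver):
            found = resolver(expr)      # may return None to pass
        else:
            found = resolver.get(expr)  # mapping lookup, None when absent
        if found is not None:
            return found
    return None                         # nothing could resolve the expression


# A mapping resolver followed by a callable resolver.
resolvers = [
    {'i': 'O', 'j': 'O'},
    lambda expr: 'V' if expr.startswith('a') else None,
]

range_of_i = resolve_range('i', resolvers)
range_of_a1 = resolve_range('a1', resolvers)
range_of_x = resolve_range('x', resolvers)
```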
-
add_resolver_for_dumms
()[source]¶ Add the resolver for the dummies for each range.
With this method, the default dummies for each range will be resolved to be within the range for all of them. This method should normally be called by all subclasses after the dummies for all ranges have been properly set.
Note that dummies added later will not be automatically added. This method can be called again.
-
resolvers
¶ The broadcast form of the resolvers.
-
set_tensor_method
(name, func)[source]¶ Set a new tensor method under the given name.
A tensor method is a method that can be called from tensors created from the current drudge as if it were a method of the given tensor. This can give cleaner and more consistent code for all tensor manipulations.
The given function, or bound method, should be able to accept the tensor as the first argument.
-
get_tensor_method
(name)[source]¶ Get a tensor method with given name.
When the name cannot be resolved, KeyError will be raised.
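One way such a registry could work is sketched below in plain Python. The classes `MiniDrudge` and `MiniTensor` are hypothetical stand-ins, and the use of `__getattr__` to forward the lookup is an assumption about the mechanism, not a description of drudge's source.

```python
import functools


class MiniDrudge:
    """Hypothetical stand-in holding a registry of tensor methods."""

    def __init__(self):
        self._tensor_methods = {}

    def set_tensor_method(self, name, func):
        self._tensor_methods[name] = func

    def get_tensor_method(self, name):
        # A plain dictionary lookup raises KeyError for unknown names.
        return self._tensor_methods[name]


class MiniTensor:
    """Hypothetical stand-in that forwards unknown attributes to the drudge."""

    def __init__(self, drudge, terms):
        self.drudge = drudge
        self.terms = terms

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has failed.
        try:
            func = self.drudge.get_tensor_method(name)
        except KeyError:
            raise AttributeError(name) from None
        # The tensor is passed as the first argument of the function.
        return functools.partial(func, self)


def count_terms(tensor):
    return len(tensor.terms)


dr = MiniDrudge()
dr.set_tensor_method('count_terms', count_terms)
tensor = MiniTensor(dr, ['t1', 't2', 't3'])
n = tensor.count_terms()
```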
-
vec_colour
¶ The vector colour function.
Note that this accessor returns the function itself, rather than directly computing the colour for any vector.
-
normal_order
(terms, **kwargs)[source]¶ Normal order the terms in the given tensor.
This method should be called with the RDD of some terms, and another RDD of terms, where all the vector parts are normal ordered according to domain-specific rules, should be returned.
By default, we work with the free algebra, so nothing is done by this function. For noncommutative algebraic systems, this function needs to be overridden to return an RDD of the normal-ordered terms for the given terms.
-
sum
(*args, predicate=None) → drudge.drudge.Tensor[source]¶ Create a tensor for the given summation.
This is the core function for creating tensors from scratch. The arguments should start with the summations, each of which should be given as a sequence, normally a tuple, starting with a SymPy symbol for the summation dummy in the first entry. Then come possibly multiple domains that the dummy is going to be summed over, which can be symbolic ranges, SymPy expressions, or iterables over them. When symbolic ranges are given as
Range
objects, the given dummy will be set to be summed over the ranges symbolically. When SymPy expressions are given, the given values will substitute all appearances of the dummy in the summand. When we have multiple summations, terms in the result are generated from the Cartesian product of them.
The last argument should give the actual thing to be summed, which can be something that can be interpreted as a collection of terms, or a callable that is going to return the summand when given a dictionary giving the action on each of the dummies. The dictionary has an entry for all the dummies. A dummy summed over a symbolic range has the actual range as its value, while a dummy given concrete values has the actual SymPy expression. In the returned summand, if dummies still exist, they are going to be treated in the same way as statically-given summands.
The predicate can be a callable that returns a boolean when called with the same dictionary. False values can be used to skip some terms. It is guaranteed that the same dictionary will be used for both the predicate and the summand when they are given as callables.
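The Cartesian-product and predicate behaviour for concrete values can be sketched without drudge. This is a simplified illustration: the function name `expand_concrete_sum` is hypothetical, plain integers stand in for SymPy expressions, and symbolic ranges are not modelled at all.

```python
import itertools


def expand_concrete_sum(sums, summand, predicate=None):
    """Hypothetical sketch of summation over concrete values only.

    Each entry of ``sums`` is a tuple whose first entry is the dummy and
    whose remaining entries are the values it takes.
    """
    dummies = [s[0] for s in sums]
    domains = [s[1:] for s in sums]
    results = []
    for values in itertools.product(*domains):
        action = dict(zip(dummies, values))
        # The same dictionary goes to both the predicate and the summand.
        if predicate is not None and not predicate(action):
            continue
        results.append(summand(action))
    return results


pairs = expand_concrete_sum(
    [('i', 1, 2), ('j', 1, 2)],
    lambda action: (action['i'], action['j']),
    predicate=lambda action: action['i'] <= action['j'],
)
```

With the predicate above, the combination `i = 2, j = 1` is skipped and the other three are kept.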
-
einst
(summand) → drudge.drudge.Tensor[source]¶ Create a tensor from Einstein summation convention.
By calling this function, summations according to the Einstein summation convention will be added to the terms. Note that for a symbol to be recognized as a summation dummy, it must appear exactly twice in its original form in indices, and its range must be resolvable. When a symbol looks like an Einstein summation dummy but does not satisfy the requirements precisely, it will not be added as a summation, and a warning will be given for reference.
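The recognition rule (exactly two appearances in indices, with a resolvable range) can be sketched in plain Python. Strings stand in for SymPy symbols, the function name `find_einstein_dummies` is hypothetical, and the warning for near misses is left out because the source does not say how they are detected.

```python
from collections import Counter


def find_einstein_dummies(index_lists, resolvable):
    """Hypothetical sketch: pick symbols that qualify as summation dummies.

    ``index_lists`` holds the indices of each factor, and ``resolvable``
    is the set of symbols whose range can be resolved.
    """
    counts = Counter(
        index for indices in index_lists for index in indices
    )
    return [
        symbol for symbol, count in counts.items()
        if count == 2 and symbol in resolvable
    ]


# Indices of x[i, j] * y[j, k]: only j appears exactly twice.
dummies = find_einstein_dummies(
    [('i', 'j'), ('j', 'k')], resolvable={'i', 'j', 'k'}
)
# The same indices, but with no resolvable range for j.
unresolved = find_einstein_dummies(
    [('i', 'j'), ('j', 'k')], resolvable={'i', 'k'}
)
```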
-
create_tensor
(terms)[source]¶ Create a tensor with the terms given in the argument.
The terms should be given as an iterable of Term objects. This function should not be necessary in user code.
-
format_latex
(tensor, sep_lines=False)[source]¶ Get the LaTeX form of a given tensor.
Subclasses should fine-tune the appearance of the resulting LaTeX form by overriding the methods
_latex_sympy
,
_latex_vec
, and
_latex_vec_mul
.
-
__weakref__
¶ list of weak references to the object (if defined)
-
-
class
drudge.
Tensor
(drudge: drudge.drudge.Drudge, terms: pyspark.rdd.RDD, free_vars: typing.Set[sympy.core.symbol.Symbol] = None, expanded=False, repartitioned=False)[source]¶ The main tensor class.
A tensor is an aggregate of terms distributed and managed by Spark. Here most operations needed for tensors are defined.
Normally, tensor instances are created from drudge methods or tensor operations. Direct invocation of the constructor is seldom needed in user scripts.
-
__init__
(drudge: drudge.drudge.Drudge, terms: pyspark.rdd.RDD, free_vars: typing.Set[sympy.core.symbol.Symbol] = None, expanded=False, repartitioned=False)[source]¶ Initialize the tensor.
This function is not designed to be called by users directly. Tensor creation should be carried out by the factory functions inside drudges and the operations defined here.
The default values for the keyword arguments are always the safest choice. For better performance, manipulations are encouraged to give proper consideration to all the keyword arguments.
-
terms
¶ The terms in the tensor, as an RDD object.
Although for users, normally there is no need for direct manipulation of the terms, it is still exposed here for flexibility.
-
local_terms
¶ Gather the terms locally into a list.
The returned list is read-only and should never be mutated.
Warning
This method will gather all terms into the memory of the driver.
-
n_terms
¶ Get the number of terms.
Zero terms signifies a zero tensor. Accessing this property will cause the tensor to be cached automatically.
-
cache
()[source]¶ Cache the terms in the tensor.
This method should be called when this tensor is an intermediate result that will be used multiple times. The tensor itself will be returned for the ease of chaining.
-
repartition
(num_partitions=None, cache=False)[source]¶ Repartition the terms across the Spark cluster.
This function should be called when the terms need to be rebalanced among the workers. Note that this incurs a Spark RDD shuffle operation and might be very expensive. Its invocation and the number of partitions used need to be fine-tuned for different problems to achieve good performance.
Parameters: - num_partitions (int) – The number of partitions. By default, the number is read from the drudge object.
- cache (bool) – If the result is going to be cached.
-
is_scalar
¶ If the tensor is a scalar.
A tensor is considered a scalar when none of its terms has a vector part. This property will make the tensor automatically cached.
-
free_vars
¶ The free variables in the tensor.
-
expanded
¶ If the tensor is already expanded.
-
repartitioned
¶ If the terms in the tensor are already repartitioned.
-
__str__
()[source]¶ Get the string representation of the tensor.
Note that this function will gather all terms into the driver.
-
latex
(sep_lines=False)[source]¶ Get the latex form for the tensor.
The actual printing is dispatched to the drudge object for the convenience of tuning the appearance.
Parameters: sep_lines (bool) – If terms should be put into separate lines by separating them with \\.
-
apply
(func, **kwargs)[source]¶ Apply the given function to the RDD of terms.
This function is analogous to the _replace method of Python named tuples: when a value for the tensor initializer is not given, the same value from self is used. The terms get special treatment since they are the centre of tensor objects. The drudge is always kept the same.
Users generally do not need this method. It is exposed here just for flexibility and convenience.
Warning
For developers: Note that the resulting tensor will inherit all unspecified keyword arguments from self. This method can give unexpected results if certain arguments are not correctly reset when they need to be. For instance, when expanded is not reset although the result is no longer guaranteed to be in expanded form, later expansions could be skipped when they actually need to be performed.
So all functions using this method need to be reviewed when new properties are added to the tensor class. Direct invocation of the tensor constructor is a much safer alternative.
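The inheritance pitfall in the warning can be reproduced with a standard-library named tuple, whose `_replace` method keeps every field that is not given. The `State` tuple below is a hypothetical stand-in for the tensor's keyword arguments, not a drudge type.

```python
from collections import namedtuple

# Hypothetical stand-in for the state carried by a tensor.
State = namedtuple('State', ['terms', 'expanded', 'repartitioned'])

old = State(terms=('a', 'b'), expanded=True, repartitioned=False)

# Only the terms are replaced, so ``expanded`` is silently inherited,
# even though the new amplitude is a sum that is no longer expanded.
risky = old._replace(terms=('a + b',))

# Resetting the flag explicitly avoids skipping a needed expansion later.
safe = old._replace(terms=('a + b',), expanded=False)
```

In this sketch `risky.expanded` is still true, which mirrors the situation where a later expansion would be skipped by mistake.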
-
reset_dumms
(excl=None)[source]¶ Reset the dummies.
The dummies will be set to the canonical dummies according to the order in the summation list. This method is especially useful on canonicalized tensors.
Parameters: excl – A set of symbols to be excluded in the dummy selection. This option can be useful when some symbols already used as dummies are planned to be used for other purposes.
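The selection rule can be sketched in plain Python, assuming each range has an ordered list of default dummies. The function name `canonical_dummies`, the string stand-ins for symbols and ranges, and the exact way exclusions are skipped are all assumptions for illustration.

```python
def canonical_dummies(sums, dumms, excl=frozenset()):
    """Hypothetical sketch: map existing dummies to canonical ones.

    ``sums`` is the summation list as (dummy, range) pairs, and ``dumms``
    maps each range to its ordered default dummies.
    """
    # One iterator per range, skipping the excluded symbols.
    pools = {
        range_: iter([d for d in names if d not in excl])
        for range_, names in dumms.items()
    }
    # Dummies are assigned in the order of the summation list.
    return {dummy: next(pools[range_]) for dummy, range_ in sums}


substitution = canonical_dummies(
    [('p', 'O'), ('q', 'O'), ('a', 'V')],
    {'O': ['i', 'j', 'k'], 'V': ['a', 'b']},
    excl={'i'},
)
```

Since `i` is excluded, the two summations over range `O` receive `j` and `k`.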
-
simplify_amps
()[source]¶ Simplify the amplitudes in the tensor.
This method simplifies the amplitude in the terms of the tensor by using the facility from SymPy. The zero terms will be filtered out as well.
-
simplify_deltas
()[source]¶ Simplify the deltas in the tensor.
An attempt will be made to simplify Kronecker deltas whose operands contain dummies.
-
expand
()[source]¶ Expand the terms in the tensor.
By calling this method, terms in the tensor whose amplitude is the addition of multiple parts will be expanded into multiple terms.
-
sort
()[source]¶ Sort the terms in the tensor.
The terms will generally be sorted according to increasing complexity.
-
merge
()[source]¶ Merge terms with the same vector and summation part.
This function merges terms only when their summation list and vector part are syntactically the same. So it is more useful when the canonicalization has been performed and the dummies reset.
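The reason canonicalization and dummy resetting matter can be seen in a small plain-Python sketch. Terms are modelled as hypothetical `(sums, amplitude, vectors)` tuples with numeric amplitudes; this is not drudge's term type or its merge algorithm.

```python
from collections import defaultdict


def merge_terms(terms):
    """Hypothetical sketch: add amplitudes of syntactically equal terms."""
    merged = defaultdict(int)
    for sums, amp, vecs in terms:
        # The key is purely syntactic: summation list plus vector part.
        merged[(sums, vecs)] += amp
    return [(sums, amp, vecs) for (sums, vecs), amp in merged.items()]


terms = [
    (('i',), 2, ('c_i',)),
    (('j',), 3, ('c_j',)),  # same term mathematically, different dummy
    (('i',), 5, ('c_i',)),
]
result = merge_terms(terms)
```

The first and third terms merge, while the second stays separate because its dummy has a different name, even though it is the same sum mathematically.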
-
canon
()[source]¶ Canonicalize the terms in the tensor.
This method will first expand the terms in the tensor. Then the canonicalization algorithm is going to be applied to each of the terms. Note that this method does not rename the dummies.
-
normal_order
()[source]¶ Normal order the terms in the tensor.
The actual work is dispatched to the drudge, who has domain specific knowledge about the noncommutativity of the vectors.
-
simplify
()[source]¶ Simplify the tensor.
This is the master driver function for tensor simplification.
-
__eq__
(other)[source]¶ Compare the equality of tensors.
Note that this function only compares the syntactical equality of tensors. Mathematically equal tensors might be compared to be unequal by this function when they are not simplified.
Only the comparison with zero is performed in a distributed way, by counting the number of terms. Otherwise, this function gathers all terms in both tensors, which can be very expensive. So direct comparison of two tensors is mostly suitable for testing and debugging on small problems only. For large-scale problems, it is advised to compare the simplified difference with zero.
-
__add__
(other)[source]¶ Add the two tensors together.
The terms in the two tensors will be concatenated together, without any further processing.
In addition to full tensors, tensor inputs can also be directly added.
-
__mul__
(other) → drudge.drudge.Tensor[source]¶ Multiply the tensor.
This multiplication operation is done completely within the framework of free algebras. The vectors are only concatenated without further processing. The actual handling of the commutativity should be carried out at the normal ordering operation for different problems.
In addition to full tensors, tensors can also be multiplied by user tensor input directly.
-
__or__
(other)[source]¶ Compute the commutator with another tensor.
In the same way as multiplication, this can be used for both full tensors and local tensor input.
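Multiplication in the free algebra and the commutator built from it can be sketched in plain Python. Tensors are modelled as hypothetical lists of `(amplitude, vectors)` pairs, and the commutator is taken in its usual form, [A, B] = AB - BA, with no normal ordering or merging applied.

```python
def free_mul(a, b):
    """Hypothetical sketch: multiply by concatenating the vector parts."""
    return [
        (amp_a * amp_b, vecs_a + vecs_b)
        for amp_a, vecs_a in a
        for amp_b, vecs_b in b
    ]


def commutator(a, b):
    """Hypothetical sketch of [a, b] = a b - b a in the free algebra."""
    negated = [(-amp, vecs) for amp, vecs in free_mul(b, a)]
    # Terms are only concatenated; nothing is merged or reordered.
    return free_mul(a, b) + negated


a = [(1, ('x',))]
b = [(2, ('y',))]
comm = commutator(a, b)
```

The two resulting terms differ only in the order of the vectors, so whether they cancel depends on the commutation rules applied later in normal ordering.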
-
subst
(lhs, rhs, wilds=None)[source]¶ Substitute all appearances of the defined tensor.
When the given LHS is a plain SymPy symbol, all its appearances in the amplitude of the tensor will be replaced. The LHS can also be an indexed SymPy expression or an indexed Vector, in which case all appearances of the indexed base or vector base will be matched against the indices on the LHS. When the matching succeeds for all the indices, the RHS, with the substitution found in the matching performed, will replace the indexed base in the amplitude, or the vector. Note that for a scalar LHS, the RHS must contain no vector.
Since we do not commonly define tensors with wild symbols, an option
wilds
can be used to give a mapping translating plain symbols on the LHS and the RHS to the wild symbols to be used. The default value of None makes all plain symbols in the indices of the LHS be translated into a wild symbol with the same name and no exclusion. An empty dictionary can be used to disable all such automatic translation. The default value of None should satisfy most needs.
-
act
(lhs, tensor, wilds=None)[source]¶ Act on a tensor by substituting all its appearances.
This method is the active voice version of the
Tensor.subst()
function. Here the tensor object that is called on serves as the definition, and the argument gives the tensor to be replaced.
-
diff
(variable, real=False, wirtinger_conj=False)[source]¶ Differentiate the tensor to get the analytic gradient.
This function supports evaluating the derivative with respect to either a plain symbol or a tensor component. This is achieved by delegating the core differentiation operation to SymPy, so a very wide range of expressions is supported.
Warning
For non-analytic complex functions, this function gives the Wirtinger derivative with respect to the given variable only. The other non-vanishing derivative with respect to the conjugate needs to be evaluated by another invocation with
wirtinger_conj
set to true.
Parameters: - variable – The variable to differentiate with respect to. It should be either a plain SymPy symbol or an indexed quantity. When it is an indexed quantity, the indices should be plain symbols with resolvable range.
- real (bool) – If the variable is going to be assumed to be real. Real variables have conjugates equal to themselves.
- wirtinger_conj (bool) – If we evaluate the Wirtinger derivative with respect to the conjugate of the variable.
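For reference, the two Wirtinger derivatives mentioned in the warning have the following standard definitions for a complex variable written as z = x + iy. This is general mathematics given as background, not a statement about how drudge evaluates them internally.

```latex
\frac{\partial}{\partial z}
  = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\,\frac{\partial}{\partial y}\right),
\qquad
\frac{\partial}{\partial \bar{z}}
  = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y}\right),
\qquad z = x + i y.

% Example: the non-analytic function f = z \bar{z} = x^2 + y^2
\frac{\partial f}{\partial z} = \frac{1}{2}\,(2x - 2 i y) = \bar{z},
\qquad
\frac{\partial f}{\partial \bar{z}} = \frac{1}{2}\,(2x + 2 i y) = z.
```

In the example both derivatives are non-vanishing, which is the case where a second invocation with wirtinger_conj set to true is needed.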
-